WO2023116259A1 - Method and apparatus for assisting model segmentation, and readable storage medium - Google Patents


Info

Publication number
WO2023116259A1
WO2023116259A1 (PCT/CN2022/131852, CN2022131852W)
Authority
WO
WIPO (PCT)
Prior art keywords
model
request
reasoning
participate
segmentation
Prior art date
Application number
PCT/CN2022/131852
Other languages
English (en)
French (fr)
Inventor
刘莹莹 (Liu Yingying)
段小嫣 (Duan Xiaoyan)
Original Assignee
大唐移动通信设备有限公司 (Datang Mobile Communications Equipment Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大唐移动通信设备有限公司 (Datang Mobile Communications Equipment Co., Ltd.)
Publication of WO2023116259A1



Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in wireless communication networks

Definitions

  • the present disclosure relates to the field of communication technologies, and in particular, to a method, device and readable storage medium for assisting model segmentation.
  • Many mobile terminals lack the capability to fully execute an artificial intelligence/machine learning model (hereinafter referred to as an AI/ML model); therefore, the current method is to transfer the reasoning of many AI/ML models from the mobile terminal to the cloud or to other terminals, that is, the AI/ML model needs to be transferred to the cloud or to other terminals.
  • the present disclosure provides a method, a device, and a readable storage medium for assisting model segmentation, which solve the technical problems that, in the prior art, the analysis of model segmentation based on terminal capabilities cannot be realized, and the protection of terminal privacy and the optimization of network resources cannot be effectively achieved.
  • the present disclosure provides a method for assisting model segmentation, which is applied to a network entity, and the method includes:
  • the determining the segmentation result of the AI/ML model segmentation according to the first message includes:
  • according to the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, the privacy level of the data set required for reasoning over the AI/ML model, and/or the delay requirement information of different layers of the model, determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
  • the determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning includes:
  • if the computing power that the target UE can provide is higher than the first preset computing power threshold and lower than the second preset computing power threshold,
  • the memory that the target UE can provide is higher than the first preset memory threshold and lower than the second preset memory threshold,
  • and the remaining power of the target UE is higher than the first preset power threshold and lower than the second preset power threshold, it is determined that the target UE performs reasoning of the second preset number of layers, where the first preset number of layers is smaller than the second preset number of layers;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the first preset privacy level and lower than the second
  • preset privacy level, and the delay requirement information of different layers of the model is higher than the first preset delay and lower than the second preset delay, it is determined that the target UE performs reasoning of the second preset number of layers;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the second preset privacy level and lower than the third
  • preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than the third preset delay, it is determined that the target UE performs reasoning of the third preset number of layers; by analogy, until it is determined that the target UE executes reasoning of the (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers;
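  • Read as an algorithm, the banded rules above amount to a small threshold ladder: each capability metric falls into a band between consecutive preset thresholds, and the band determines which preset number of layers the target UE reasons over. The sketch below is only an illustration of that ladder; the function name, the numeric thresholds, and the choice to let the weakest metric bound the result are assumptions, not taken from the disclosure.

```python
def select_layer_count(compute, memory, power,
                       compute_thresholds, memory_thresholds, power_thresholds):
    """Map a target UE's capability onto a preset number of model layers.

    Illustrative sketch: if any capability falls below its first threshold,
    the UE runs only the first preset number of layers; if all capabilities
    sit between the k-th and (k+1)-th thresholds, the UE runs the (k+1)-th
    preset number of layers, and so on. Names and thresholds are assumed.
    """
    def band(value, thresholds):
        # Band index for one metric: number of thresholds the value reaches.
        k = 0
        for t in thresholds:
            if value >= t:
                k += 1
            else:
                break
        return k

    # The UE's overall band is limited by its weakest capability (assumption).
    k = min(band(compute, compute_thresholds),
            band(memory, memory_thresholds),
            band(power, power_thresholds))
    # Band 0 -> first preset number of layers, band 1 -> second, and so on.
    return k + 1
```

  • For example, under these assumed thresholds a UE whose compute and power sit in the second band but whose memory sits in the first band would be assigned the second preset number of layers.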
  • the network entity receives the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis, and determines the model segmentation point information of the AI/ML model according to the computing power that can be provided by the UE(s) to participate in model joint reasoning in the first message,
  • the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning.
  • if the privacy level of the data set and/or the delay of different layers of the model are required when reasoning over the AI/ML model (wherein, if the UE cannot provide the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF; if the UE cannot provide the privacy level of the required data set, it is confirmed whether the privacy level of the data set can be set based on SA3; for example, if SA3 can provide the privacy level setting, the privacy level of the data set can be provided by SA3), the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model can also be combined to determine the model segmentation point information of the AI/ML model as the segmentation result, so that the network entity or the UE(s) to participate in model joint reasoning perform the joint reasoning operation based on the segmentation result; therefore, the analysis of model segmentation based on the capability of the terminal (that is, the UE) is realized, and the protection of terminal privacy and the optimization of network resources are effectively achieved.
  • the parameters carried in the first request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification; the identification of a user equipment (UE) or a group of UEs that accept AI/ML model segmentation, or of any UEs that meet the segmentation conditions; and the area of AI/ML model segmentation.
  • according to the parameters carried in the first request, a second message is sent to the 5GC NF(s); the second message is used to request the 5GC NF(s) to collect first data corresponding to the UE(s), and the first
  • data includes at least one of the following items: the UE(s) to participate in AI/ML model segmentation and their SUPI, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the first request include the privacy level and/or the delay requirement information of different layers of the model, the first data also includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • the method further includes:
  • the delay requirement information of different layers of the model is provided by the AF.
  • the method further includes:
  • the seventh request result also includes the model segmentation point information corresponding to the other UE(s) participating in model joint reasoning, and this information is transparently transmitted through the AF to the UE(s).
  • when the UE and the NWDAF jointly infer a model, the UE reports its own capability to the NWDAF through the AF and at the same time requests the NWDAF to determine the model segmentation point; the NWDAF feeds back the determination result to the UE through the AF.
  • a ninth request is sent to the network data analytics function (NWDAF); the parameters carried in the ninth request include the parameters carried in the eighth request, and the ninth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when reasoning over the AI/ML model, the parameters carried in the ninth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • a ninth request result sent by the NWDAF is received, and the ninth request result and the parameters carried in the eighth request are used as the first message; the ninth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that they can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when reasoning over the AI/ML model, the ninth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • the result of determining whether the UE(s) to participate in model joint reasoning have the ability to fully support the execution of model joint reasoning, the relevant information of the UE(s) to participate in model joint reasoning, and
  • the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning are sent to the UE;
  • when the UE and the AF jointly reason over a model, the UE reports its own capabilities to the AF, the AF performs the joint reasoning and the determination of the model segmentation points, and then feeds back the determination result to the UE.
  • the receiving the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis includes:
  • the method further includes:
  • when the UE and the AF jointly infer a model, the AF requests the new network entity MMF to perform model segmentation, the MMF requests the NWDAF to collect the capabilities of the UE, the MMF determines the model segmentation point based on the received analysis results and feeds the result back to the AF, and the AF initiates a joint reasoning request to the relevant UE(s).
  • if the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, the parameters carried in the fifteenth request are used as the first message;
  • if the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, a sixteenth request is sent to the NWDAF; the parameters carried in the sixteenth request include the parameters carried in the fifteenth request, and the sixteenth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning and the computing power that those UE(s) can provide.
  • if the privacy level of the data set and/or the delay of different layers of the model are required when reasoning over the AI/ML model, the parameters carried in the sixteenth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • the sixteenth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that they can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when reasoning over the AI/ML model, the sixteenth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • the fifteenth request result also includes the model segmentation point information corresponding to the other UE(s) participating in model joint reasoning, and this information is transparently transmitted to the UE(s) to participate in model joint reasoning.
  • when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF to request model joint inference, and the AF transmits its own and the UE's capabilities to the MMF and requests the MMF to perform model segmentation; if the MMF determines that other UE(s) should participate in model joint reasoning, the MMF requests the NWDAF to collect the information of the other UE(s); the MMF performs model segmentation based on the collected information and feeds back the results to the AF.
  • the seventeenth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the seventeenth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification; the identification of a user equipment (UE) or a group of UEs receiving AI/ML model segmentation, or of any UEs that meet the analysis conditions; the area of AI/ML model segmentation; the size of the AI/ML model; the computing power available to the UE(s) to participate in model joint reasoning; the memory available to the UE(s) to participate in model joint reasoning; and the remaining power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning over the AI/ML model, the parameters carried in the seventeenth request also include: the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model;
  • the ninth message includes the AI/ML model segmentation point and the data of the first message; the ninth message is used to provide the AF with
  • the parameters carried when sending the nineteenth request to the UE(s) to participate in model joint reasoning; the nineteenth request is used to request the UE(s) to perform the model joint reasoning operation with the UE(s) to participate in model joint reasoning, and the parameters carried in the nineteenth request include the ninth message.
  • when the UE and the AF jointly infer a model, the AF requests the PCF to determine the model segmentation strategy, the PCF requests the NWDAF to collect the capabilities of the UE, the PCF determines the model segmentation point based on the received analysis results and feeds the result back to the AF, and the AF initiates a joint reasoning request to the relevant UE(s).
  • the receiving the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis includes:
  • the twentieth request is used to request AI/ML model segmentation analysis
  • the parameters carried in the twentieth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification; the computing power that can be provided by the UE(s) to participate in model joint reasoning; the memory that can be provided by the UE(s) to participate in model joint reasoning; and the remaining power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning over the AI/ML model, the parameters carried in the twentieth request also include: the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the AF and the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, a twenty-first request is sent to the NWDAF; the parameters carried in the twenty-first request include the parameters carried in the twentieth request, and the twenty-first request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that those UE(s) can provide.
  • the twenty-first request result includes: the other UE(s) that can participate in model inference and the computing power, available memory, and remaining power that they can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when inferring the AI/ML model, the twenty-first request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • the delay requirement information of different layers of the model is provided by the AF.
  • when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF to request model joint inference, and the AF sends its own and the UE's capabilities to the PCF and requests the PCF to perform model segmentation; if the PCF determines that other UE(s) should participate in model joint reasoning, the PCF requests the NWDAF to collect the information of the other UE(s); the PCF performs model segmentation based on the collected information and feeds back the results to the AF.
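  • The PCF-mediated flow above can be sketched as a simple message trace. The sketch below assumes dictionary-shaped capability reports, a `sufficient` flag marking whether the UE alone can support joint reasoning, and a pluggable `split_fn` standing in for the PCF's segmentation logic; none of these names come from the disclosure.

```python
def pcf_assisted_split_flow(ue_caps, af_caps, other_ue_info, split_fn):
    """Illustrative trace of the flow: the UE reports its capabilities to the
    AF, the AF forwards its own and the UE's capabilities to the PCF, the PCF
    asks the NWDAF to collect other UE(s)' information when the UE alone is
    not sufficient, performs model segmentation, and replies to the AF."""
    trace = []
    trace.append(("UE -> AF", ue_caps))
    trace.append(("AF -> PCF", {"af": af_caps, "ue": ue_caps}))
    # PCF decides whether other UE(s) must join the joint reasoning (assumed flag).
    needs_others = not ue_caps.get("sufficient", False)
    if needs_others:
        trace.append(("PCF -> NWDAF", "collect other UE(s) info"))
        trace.append(("NWDAF -> PCF", other_ue_info))
    # PCF performs model segmentation on the collected information.
    result = split_fn(ue_caps, other_ue_info if needs_others else None)
    trace.append(("PCF -> AF", result))
    return trace
```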
  • the present disclosure provides a method for assisting model segmentation, the method is applied to a user equipment UE, and the method includes:
  • the determination of AI/ML model segmentation point information according to its own capability information includes:
  • if the privacy level of the data set and/or the time delay of different layers of the model are required when inferring the AI/ML model, then the model segmentation point information is determined according to the available computing power, available memory, and remaining power in the UE's own capability information, together with the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model.
  • if the computing power that the target UE can provide is lower than the first preset computing power threshold, or the memory that the target UE can provide is lower than the first preset memory threshold, or the remaining power of the target UE is lower than the first preset
  • power threshold, it is determined that the target UE performs reasoning of the first preset number of layers;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the second preset privacy level and lower than the third
  • preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than the third preset delay, it is determined that the target UE performs reasoning of the third preset number of layers; by analogy, until it is determined that the target UE executes reasoning of the (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers;
  • the model segmentation point information is determined according to the available computing power, available memory, and remaining power in the UE's own capability information; if the privacy level of the data set and/or
  • the time delay of different layers of the model are required when reasoning over the AI/ML model, the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model in the UE's own capability information can also be combined to determine the model segmentation point information of the AI/ML model as the segmentation result, which is used to realize the joint reasoning operation of the network entity or of the UE(s) to participate in model joint reasoning based on the segmentation result. Therefore, the analysis of model segmentation based on the capability of the terminal (i.e., UE) is realized, and the protection of terminal privacy and the optimization of network resources are effectively achieved.
  • the UE determines the model segmentation point based on its own capability, reports the model segmentation point (model segmentation ratio) to the AF, and performs the joint inference interaction.
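  • A minimal sketch of the UE-side behaviour above, assuming the UE condenses its own capability into a single 0-to-1 score and maps it linearly onto the model's layers; the score, the linear mapping, and the report field names are illustrative assumptions, not taken from the disclosure.

```python
def ue_report_split(capability, total_layers):
    """Sketch: the UE picks how many layers it can run locally from its own
    capability, expresses that as a model segmentation ratio, and builds the
    report it would send to the AF for the joint-inference interaction."""
    # Clamp the assumed 0..1 capability score and convert it to a layer count.
    score = max(0.0, min(1.0, capability))
    local_layers = int(score * total_layers)
    split_ratio = local_layers / total_layers
    # Illustrative report fields; the AF side would read these on reception.
    return {"model_split_point": local_layers, "model_split_ratio": split_ratio}
```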
  • the AF sends a second request to the network data analysis function NWDAF, and the second request is used to request a model joint reasoning operation with the NWDAF;
  • the parameters carried in the second request include at least one of the following: the AI/ML model segmentation point; the analysis type identification associated with the model identification or model segmentation identification; the identification of a user equipment (UE) or a group of UEs that receive AI/ML model segmentation, or of any UEs that meet the analysis conditions; the area of AI/ML model segmentation; the size of the AI/ML model; the computing power available to the UE to participate in model joint reasoning; the memory available to the UE to participate in model joint reasoning; and the remaining power of the UE to participate in model joint reasoning;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required for the AI/ML model, the parameters carried in the second request also include: the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model;
  • a second request result sent by the NWDAF is received; the second request result is determined by the NWDAF according to the parameters carried in the second request, and the second request result includes accepting the second request or not accepting the second request;
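  • The accept-or-reject outcome above can be sketched as a tiny request handler, where a `can_support` predicate stands in for the NWDAF's internal evaluation of the carried parameters; the function and field names are assumptions for illustration only.

```python
def handle_joint_inference_request(params, can_support):
    """Sketch: evaluate the parameters carried in a joint-reasoning request
    and return a result that either accepts or rejects the request."""
    if can_support(params):
        # Echo the parameters back so the requester knows what was accepted.
        return {"result": "accepted", "params": params}
    return {"result": "rejected"}
```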
  • the present disclosure provides a device for assisting model segmentation, the device is applied to a network entity, and the device includes a memory, a transceiver, and a processor:
  • a segmentation result of AI/ML model segmentation is determined.
  • a receiving unit configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis
  • the determination unit is used to determine the segmentation point information of the AI/ML model according to its own capability information
  • a processing unit configured to use the AI/ML model segmentation point information as a segmentation result of the AI/ML model segmentation.
  • the present disclosure provides a processor-readable storage medium; the processor-readable storage medium stores a computer program, and the computer program is used to enable the processor to execute the method described in any one of the first aspect or the second aspect.
  • the model segmentation point information of the AI/ML model is used as the segmentation result, so that the network entity or the UE(s) to participate in model joint reasoning perform the joint reasoning operation on the model based on the segmentation result. Therefore, the analysis of model segmentation based on the capability of the terminal (i.e., UE) is realized, and the protection of terminal privacy and the optimization of network resources are effectively achieved.
  • FIG. 2 is a first schematic flowchart of a method for assisting model segmentation provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of a third signaling flow of the method for assisting model segmentation when the network entity is NWDAF according to Embodiment 1 of the present disclosure
  • FIG. 6 is a schematic diagram of a fourth signaling flow of the method for assisting model segmentation when the network entity is NWDAF according to Embodiment 1 of the present disclosure
  • FIG. 7 is a schematic diagram of the first signaling flow of the method for assisting model segmentation when the network entity is an AF provided in Embodiment 2 of the present disclosure
  • FIG. 8 is a schematic diagram of the second signaling flow of the method for assisting model segmentation when the network entity is an AF provided in Embodiment 2 of the present disclosure
  • FIG. 9 is a schematic diagram of a first signaling flow of a method for assisting model segmentation when the network entity is an MMF provided in Embodiment 3 of the present disclosure.
  • FIG. 10 is a schematic diagram of a second signaling flow of the method for assisting model segmentation when the network entity is an MMF provided in Embodiment 3 of the present disclosure
  • FIG. 12 is a schematic diagram of the second signaling flow of the method for assisting model segmentation when the network entity is a PCF provided by Embodiment 4 of the present disclosure
  • FIG. 13 is a second schematic flowchart of the method for assisting model segmentation provided by an embodiment of the present disclosure
  • FIG. 15 is a schematic structural diagram of a device for assisting model segmentation provided by another embodiment of the present disclosure.
  • FIG. 17 is a schematic structural diagram of an apparatus for assisting model segmentation provided by another embodiment of the present disclosure.
  • AI/ML model user: for example, an application client running on the UE.
  • AI/ML model provider: for example, an application function (English: Application Function, abbreviated: AF).
  • the network entity receives the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis, and determines the model segmentation point information of the AI/ML model according to the computing power that can be provided by the UE(s) to participate in model joint reasoning in the first message, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the time delay of different layers of the model are required when inferring the AI/ML model, the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model can also be combined.
  • the network data analytics function (English: Network Data Analytics Function, abbreviated: NWDAF) is a network analysis function managed by operators.
  • the NWDAF can provide data analysis services to the network functions (English: Network Function, abbreviated: NF, i.e., NF(s)) in the 5G core network (English: 5G Core Network, abbreviated: 5GC), to the application function (English: Application Function, abbreviated: AF), and to operation, administration and maintenance (English: Operation, Administration and Maintenance, abbreviated: OAM).
  • the analysis result may be historical statistical information or forecast information.
  • NWDAF can serve one or more network slices.
  • there can be different NWDAF instances in the 5GC providing different types of dedicated analysis.
  • an NWDAF instance needs to provide its supported Analytics ID, that is, the analysis type identification, when registering with the network repository function (English: Network Repository Function, abbreviated: NRF), so that a consumer NF can provide the Analytics ID to indicate what type of analysis is required when querying the NRF for an NWDAF instance.
  • the 5GC network functions and OAM decide how to use the data analysis provided by the network data analytics function (NWDAF) to improve network performance.
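  • The registration and discovery described above can be pictured with a toy registry: an NWDAF instance registers the Analytics ID(s) it supports, and a consumer NF later queries by Analytics ID to find an instance offering that analysis type. The class below is purely illustrative and is not the 3GPP NRF service API; all names are assumptions.

```python
class Nrf:
    """Toy registry sketching NWDAF registration and consumer-NF discovery
    keyed by Analytics ID. Illustrative only; not the real NRF interface."""

    def __init__(self):
        self._registry = {}  # analytic_id -> list of NWDAF instance names

    def register(self, nwdaf_instance, analytic_ids):
        # An NWDAF instance declares the Analytics ID(s) it supports.
        for aid in analytic_ids:
            self._registry.setdefault(aid, []).append(nwdaf_instance)

    def discover(self, analytic_id):
        # A consumer NF supplies the Analytics ID to indicate the analysis type.
        return self._registry.get(analytic_id, [])
```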
  • the network entity or terminal selects the model segmentation point based on information such as the terminal's power, available memory, and available computing power, as well as the delay requirements of the model and the privacy level of
  • the data set required for reasoning over the model; the network entity or terminal then sends the model segmentation point information to the network entities or terminals participating in the joint reasoning of the model.
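  • The final distribution step above can be sketched as follows, with a `send` callback standing in for whatever signalling actually carries the model segmentation point information to each participant; all names are illustrative assumptions.

```python
def distribute_split_points(participants, split_points, send):
    """Sketch: after the model segmentation point information has been chosen,
    send each participant in the joint reasoning its corresponding split
    point, and return a record of what was delivered to whom."""
    delivered = {}
    for ue, point in zip(participants, split_points):
        # The message shape is an illustrative assumption.
        send(ue, {"model_split_point": point})
        delivered[ue] = point
    return delivered
```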
  • the available memory may be, for example, the remaining storage of the terminal.
  • the network entity or terminal determines the model segmentation point information for segmenting the AI/ML model based on the computing power available from the terminal, the memory available from the terminal, and the remaining power of the terminal.
  • if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning over the AI/ML model, the privacy level of the data set required for reasoning over the AI/ML model and/or the delay requirement information of different layers of the model can also be combined to determine the model segmentation point information of the AI/ML model; as the segmentation result, it is used to realize the joint reasoning operation of the model by the network entity or the terminal based on the segmentation result. Therefore, the analysis of model segmentation based on the capability of the terminal (i.e., UE(s)) is realized, and the protection of terminal privacy and the optimization of network resources are effectively realized.
  • the UE(s) may be one UE or multiple UEs (or a group of UEs or any UEs), and the number of UEs may be determined according to specific scenarios.
  • each UE reports its own parameters (such as the computing power the UE can provide, the memory it can provide, and its remaining power); on the receiving side, the parameters reported by each UE are received, so that, taken together, the received parameters include the computing power, available memory, and remaining power of each UE (or of the UE(s)).
  • the first message includes at least one of the following items: the identification of the UE(s) to participate in AI/ML model segmentation or the subscription permanent identifier (SUPI); the identification of the application using the AI/ML model; the computing power provided by the UE(s) to participate in model joint reasoning; the memory available to the UE(s) to participate in model joint reasoning; the remaining power of the UE(s) to participate in model joint reasoning; and the size of the AI/ML model;
  • the first message also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model.
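  • The items listed above can be pictured as an optional-field record. The sketch below uses illustrative field names (none are from the disclosure), and every field is optional because the first message carries "at least one" of the items, with the privacy and delay fields present only when inference needs them.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FirstMessage:
    """Illustrative shape of the first message; field names are assumptions."""
    ue_ids_or_supi: Optional[List[str]] = None   # UE(s) or SUPI(s) to take part
    application_id: Optional[str] = None         # application using the AI/ML model
    ue_computing_power: Optional[float] = None   # compute the UE(s) can provide
    ue_available_memory: Optional[float] = None  # memory the UE(s) can provide
    ue_remaining_power: Optional[float] = None   # remaining battery of the UE(s)
    model_size: Optional[float] = None           # size of the AI/ML model
    # Present only when inference requires them:
    dataset_privacy_level: Optional[int] = None
    per_layer_delay_requirements: Optional[List[float]] = None
```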
  • Step 102: the network entity determines the segmentation result of the AI/ML model segmentation according to the first message.
  • determining the segmentation result of the AI/ML model segmentation may be implemented through the following steps:
  • Step a1: according to the computing power available to the UE(s) to participate in model joint reasoning in the first message, the memory available to the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning, determine the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
• Step a2: if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, determine the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning additionally according to the privacy level of the data set required for inference and/or the delay requirement information of different layers of the model, together with the capability parameters in the first message;
  • Step a3 using the model segmentation point information as the segmentation result.
• the network entity can select the model segmentation point (i.e., the model segmentation point information, or AI/ML model segmentation point information) based on information such as the terminal's power, available memory, and available computing power, as well as the delay requirements of the model and the privacy level of the inference data set required by the model. The network entity then sends the model segmentation point information to the network entities or terminals participating in the model joint reasoning.
  • the determination of the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning may be achieved by the following steps:
• Step b2: if the computing power that the target UE can provide is lower than a first preset computing power threshold, or the memory that the target UE can provide is lower than a first preset memory threshold, or the remaining power of the target UE is lower than a first preset power threshold, determine that the target UE performs inference of a first preset number of layers;
• Step b4: if the computing power that the target UE can provide is higher than a second preset computing power threshold and lower than a third preset computing power threshold, the memory that the target UE can provide is higher than a second preset memory threshold and lower than a third preset memory threshold, and the remaining power of the target UE is higher than a second preset power threshold and lower than a third preset power threshold, determine that the target UE performs inference of a third preset number of layers, where the second preset number of layers is less than the third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)-th preset number of layers, where the N-th preset number of layers is smaller than the (N+1)-th preset number of layers;
• Step b7: if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the required data set is higher than a second preset privacy level and lower than a third preset privacy level, and the delay requirement information of different layers of the model is higher than a second preset delay and lower than a third preset delay, determine that the target UE performs inference of a third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)-th preset number of layers, where the N-th preset number of layers is smaller than the (N+1)-th preset number of layers;
• Step b8: determine the AI/ML model segmentation point information according to the preset numbers of layers, where the AI/ML model segmentation point information is used to indicate the AI/ML model segmentation ratio and the preset numbers of layers include the N-th preset number of layers, with N greater than or equal to one.
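As a non-authoritative sketch, the band-based selection in steps b2-b8 can be illustrated as follows; the threshold lists, layer counts, and function name are all assumptions invented for illustration (the disclosure defines the thresholds only abstractly):

```python
def assign_inference_layers(compute, memory, battery,
                            compute_th, memory_th, battery_th, layer_counts):
    """Pick how many model layers a target UE should infer.

    compute_th/memory_th/battery_th are ascending preset thresholds; a UE
    whose capability clears the n-th threshold on every dimension is
    assigned layer_counts[n] layers (layer_counts is ascending too, so
    weaker UEs get fewer layers, as in steps b2-b8).
    """
    # Below the first threshold on ANY dimension: smallest layer count (step b2).
    if compute < compute_th[0] or memory < memory_th[0] or battery < battery_th[0]:
        return layer_counts[0]
    # Otherwise take the highest band whose lower bounds are all cleared
    # (steps b4, b7, "and so on" up to the (N+1)-th preset number of layers).
    band = 0
    for n in range(len(compute_th)):
        if compute >= compute_th[n] and memory >= memory_th[n] and battery >= battery_th[n]:
            band = n
    return layer_counts[band]
```

For instance, with thresholds (10, 20, 30) on every dimension and layer counts (1, 2, 3), a UE reporting (25, 25, 25) would be assigned two layers under this sketch.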
• the network entity receives the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis and, according to the computing power that the UE(s) to participate in model joint reasoning can provide, the memory those UE(s) can provide, and their remaining power as carried in the first message, determines the model segmentation point information for splitting the AI/ML model. If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the model segmentation point information can also be determined by additionally combining the privacy level of the required data set and/or the delay requirement information of different layers of the model. This information serves as the segmentation result, and is used by the network (AF) and the participating UE(s) to perform joint reasoning on the model based on the segmentation result. Model segmentation is thus analyzed based on the capabilities of the terminals (UEs), which effectively protects terminal privacy and optimizes network resources.
  • the receiving the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis may include the following steps:
• Step c11: receive the first request sent by the application function (AF), either directly or through the network capability exposure function (NEF). The first request is used to request analysis of the segmentation of the AI/ML model. The parameters carried in the first request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier; the identifier of a user equipment (UE), a group of UEs, or any UEs meeting the segmentation conditions that accept AI/ML model segmentation; the area where the AI/ML model is segmented; the size of the AI/ML model; the computing power that the UE(s) to participate in model joint reasoning can provide; the memory those UE(s) can provide; and the remaining power of those UE(s). If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the parameters carried in the first request also include: the privacy level of the required data set and/or the delay requirement information of different layers of the model;
• Step c12: according to the parameters carried in the first request, send a second message to the 5GC NF(s). The second message is used to request the 5GC NF(s) to collect the first data corresponding to the UE(s). The first data includes at least one of the following: the UE(s) (or SUPI) to participate in AI/ML model segmentation, the computing power that the UE(s) to participate in model joint reasoning can provide, the memory those UE(s) can provide, the remaining power of those UE(s), and the size of the AI/ML model. If the parameters carried in the first request include the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model, the first data also includes: the privacy level of the required data set and/or the delay requirement information of different layers of the model;
  • the method further includes:
  • FIG. 3 is a schematic flowchart of a first signaling flow of a method for assisting model segmentation when the network entity is an NWDAF according to Embodiment 1 of the present disclosure.
• the specific steps are as follows. (In steps 301a and 305a the AF is in the trusted zone; in steps 301b/301c and 305b/305c the AF is in the untrusted zone.)
• Step 301a: the AF sends Nnwdaf_MLModelSplit_Request (i.e., a machine learning model split request) to the NWDAF, requesting the NWDAF to collect information about the UE(s) (i.e., the UEs to be involved in model joint reasoning).
• Step 301b: the AF sends a Nnef_MLModelSplit_Request (i.e., a machine learning model split request) to the NEF; the request includes the information in Table 1 above.
• Step 301c: after authorization, the NEF sends Nnwdaf_MLModelSplit_Request to the NWDAF; the request includes the information in Table 1 above.
• Step 302: the NWDAF calls Nnf_EventExposure_Subscribe (i.e., event exposure subscription) toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), together with (if any) the privacy level of the data set required for inferring this model and (if provided) the delay requirements of different layers of the model. The 5GC NF(s) (for example, AMF/SMF) forward the request information to the terminal; upon receiving the request (and if it accepts it), the terminal prepares the power, memory, and computing power corresponding to the Analytics ID (Model ID) and, if any, the privacy level of the data set, and feeds this information back (see Table 2) to the NF (for example, AMF/SMF).
  • step 303 5GC NF(s) calls Nnf_EventExposure_Notify (event exposure notification) to feed back required data to NWDAF.
• Step 304: the NWDAF performs analysis and selects the model segmentation point (or model segmentation ratio) based on the UE's available computing power, available memory, remaining power, the privacy level of the data set required for inferring this model, and the delay requirements of different layers of the model.
• When the computing power and memory available to the UE are low and its remaining power is low, it can only perform a few layers of inference. If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level requirements are high and/or the delay requirements of different layers are low, fewer layers of inference can be performed. Conversely, if the UE's available computing power and memory are sufficient and its remaining power is high, it can perform more layers of inference; likewise, if the privacy level requirements are low and/or the delay requirements of different layers are high, more layers can be performed. For example, for an 8-layer ML model, suppose UE1 can perform inference of the first two layers, UE2 the middle three layers, and the AF the last three layers; then UE1 corresponds to model segmentation point 1 and UE2 corresponds to model segmentation point 2.
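The 8-layer example can be sketched as follows; representing the two segmentation points as "cut after layer 2" and "cut after layer 5" is an illustrative assumption:

```python
def split_by_points(num_layers, cut_points):
    """Partition layers 1..num_layers at the given cut-after-layer points."""
    bounds = [0] + list(cut_points) + [num_layers]
    return [list(range(lo + 1, hi + 1)) for lo, hi in zip(bounds, bounds[1:])]

# UE1 takes the first two layers, UE2 the middle three, the AF the last three.
parts = split_by_points(8, [2, 5])
assignment = dict(zip(["UE1", "UE2", "AF"], parts))
# assignment == {"UE1": [1, 2], "UE2": [3, 4, 5], "AF": [6, 7, 8]}
```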
• Steps 305a-305b: the NWDAF sends the Nnwdaf_MLModelSplit_Request Response (i.e., the response to the machine learning model split request), containing information such as the model segmentation point (or model segmentation ratio), to the AF directly (305a) or to the NEF (305b).
• Step 305c: after authorization, the NEF sends the Nnef_MLModelSplit_Request response (the response to the machine learning model split request) to the AF; the content includes information such as the above-mentioned model segmentation point (or model segmentation ratio).
• Step 307: the AF sends a model joint inference request, Naf_MLModelJointInference_Request, to the relevant UE(s); the request includes the model segmentation point (or model segmentation ratio) information.
• Step 308: the UE(s) send the Naf_MLModelJointInference_Request response (i.e., the response to the model joint inference request) to the AF, indicating whether the model joint inference request is accepted.
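The parameter set carried in the split request of step 301 (Table 1) can be sketched as a simple builder; every field name here is an illustrative assumption, not the actual information-element naming of the disclosure:

```python
def build_model_split_request(analytics_id, model_id, target_ues, area, model_size,
                              privacy_level=None, layer_delay_reqs=None):
    """Assemble the (assumed) parameters of an Nnwdaf_MLModelSplit_Request."""
    request = {
        "analyticsId": analytics_id,   # analysis type identifier, e.g. "MLModelSplit"
        "modelId": model_id,
        "targetUes": target_ues,       # one UE, a group of UEs, or any qualifying UEs
        "area": area,                  # area where the model is segmented
        "modelSize": model_size,
    }
    # Optional fields, present only when the model's inference requires them.
    if privacy_level is not None:
        request["datasetPrivacyLevel"] = privacy_level
    if layer_delay_reqs is not None:
        request["layerDelayRequirements"] = layer_delay_reqs
    return request
```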
• Scenario 12: when UE(s) and AF(s) jointly infer (multiple) models, the UE reports its own capabilities to the AF, the AF requests the NWDAF to determine the model segmentation point, the NWDAF feeds the result back to the AF, and the AF sends it to the UE.
  • receiving the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis may include the following steps:
  • the delay requirement information of different layers of the model is provided by the AF.
• the Analytics ID (MLModelSplit) associated with the Model ID;
• the UE's available computing power, available memory, remaining power, the privacy level of the data set required for inferring this model, and information such as the delay requirements of different layers of the model.
• UE1 can perform inference of the first two layers, UE2 the middle three layers, and the AF the last three layers; that is, UE1 corresponds to model segmentation point 1 and UE2 corresponds to model segmentation point 2.
  • Step 405b NWDAF sends Nnwdaf_MLModelSplit_Request Response to NEF, the content includes the above-mentioned model split point (or model split ratio).
• Step c31: receive the fifth request sent by the application function AF, either directly or through the network capability exposure function NEF. The fifth request is sent after the AF determines that the AF and the UE(s) to participate in model joint reasoning do not have the capability to fully support execution of model joint reasoning; whether that capability exists is determined by the AF according to the received sixth request sent by the UE(s) to participate in model joint reasoning. The fifth request is used to request analysis of the AI/ML model segmentation and to request discovery of other UE(s) that can participate in model reasoning, together with the computing power, available memory, and remaining power those other UE(s) can provide. The parameters carried in the fifth request include at least one of the following: the analysis type identifier associated with model segmentation; the identifier of a user equipment UE, a group of UEs, or any UEs meeting the segmentation conditions that accept AI/ML model segmentation; the area where the AI/ML model is segmented; the size of the AI/ML model; the computing power that the UE(s) to participate in model joint reasoning can provide; the memory those UE(s) can provide; and the remaining power of those UE(s). If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the parameters carried in the fifth request also include: the privacy level of the required data set and/or the delay requirement information of different layers of the model. The sixth request is used by the UE(s) to participate in model joint reasoning to request performing model joint reasoning with the AF; the parameters carried in the sixth request include the parameters carried in the fifth request;
• Step c32: according to the parameters carried in the fifth request, send a fifth message to the 5GC NF(s). The fifth message is used to request the 5GC NF(s) to collect the second data corresponding to the UE(s). The second data includes at least one of the following: the UE(s) (or SUPI) to participate in AI/ML model segmentation, the computing power that the UE(s) to participate in model joint reasoning can provide, the memory those UE(s) can provide, the remaining power of those UE(s), and the size of the AI/ML model. If the parameters carried in the fifth request include the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model, the second data also includes: the privacy level of the required data set and/or the delay requirement information of different layers of the model;
  • the delay requirement information of different layers of the model is provided by the AF.
  • the method further includes:
• the sixth message includes at least the AI/ML model segmentation point information and the second data. The sixth message is used to provide the AF with the result of the sixth request sent by the UE(s) to participate in model joint reasoning; the sixth request result includes the AI/ML model segmentation point information corresponding to the UE(s) to participate in model joint reasoning, the other UE(s) that can participate in model reasoning, and their corresponding AI/ML model segmentation point information.
  • FIG. 5 is a schematic flowchart of a third signaling flow of a method for assisting model segmentation when the network entity is an NWDAF according to Embodiment 1 of the present disclosure.
• the specific steps are as follows. (In step 503a the AF is in the trusted zone; in steps 503b and 503c the AF is in the untrusted zone. In step 507a the AF is in the trusted zone; in steps 507b and 507c the AF is in the untrusted zone.)
• Steps 500-501 are the same as steps 400-401 in the embodiment of Scenario 12. (That is:
  • step 500 the AF establishes a connection with the UE.
• if the privacy level of the data set is required for inference of the AI/ML model, the request also includes the privacy level of the required data set; if not, it is checked whether it can be set based on SA3. If the delay requirement information of different layers of the model is required for inference of the AI/ML model, the request also includes that delay requirement information; if not, it is provided by the AF.
• the AF evaluates the joint inference request; if it determines that the UE and the AF together do not have sufficient capability to perform joint inference, it concludes that other UE(s) are needed to help perform model segmentation.
• the Analytics ID (MLModelSplit) associated with the Model ID;
• the UE's available computing power, available memory, and remaining power;
  • the privacy level of the data set required for reasoning this model and information such as the delay requirements of different layers of the model
  • request NWDAF to help discover other UE(s) that can participate in model reasoning and their related available computing power, available memory, remaining power, etc.
  • step 503b the AF sends a Nnef_MLModelSplit_Request to the NEF, and the request includes the above information.
  • Step 503c after NEF authorizes, send Nnwdaf_MLModelSplit_Request to NWDAF, the request includes the above information.
• Steps 504-505 are the same as steps 302-303 in the embodiment of Scenario 11. (That is: Step 504: the NWDAF calls Nnf_EventExposure_Subscribe toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), together with (if any) the privacy level of the data set required for inferring this model. The 5GC NF(s) forward the request information to the terminal; upon receiving the request (and if it accepts it), the terminal feeds back information such as the power, memory, and computing power corresponding to the Analytics ID (Model ID) and, if any, the privacy level of the data set to the NF (for example, AMF/SMF). Step 505: the 5GC NF(s) call Nnf_EventExposure_Notify to feed the required data back to the NWDAF.)
• Step 506: the NWDAF performs analysis and, based on the UE(s) in the output-data list that can participate in model segmentation, their corresponding available computing power, available memory, and remaining power, and the inference requirements, determines the model segmentation point (or model segmentation ratio).
  • Step 507 (ie steps 507a-507c) is the same as described in step 305 (ie steps 305a-305c) in the embodiment of scenario 11 (ie:
• Step 507a: the NWDAF sends information such as the model segmentation point (or model segmentation ratio) to the AF through Nnwdaf_MLModelSplit_Request Response, as shown in Table 3.
  • Step 507b NWDAF sends Nnwdaf_MLModelSplit_Request Response to NEF, the content includes information such as the above-mentioned model split point (or model split ratio).
• Step 507c: after authorization, the NEF sends the Nnef_MLModelSplit_Request response to the AF; the content includes information such as the above-mentioned model segmentation point (or model segmentation ratio).)
• Step 508: the AF sends the Naf_MLModelJointInference_Request response (the response to the model joint inference request) to the UE; the response includes the model segmentation point (or model segmentation ratio) and the other UE(s) that can participate in model inference together with their related model segmentation point (or model segmentation ratio) information.
• Scenario 14: when the UE and the NWDAF jointly infer models, the UE reports its own capability to the NWDAF through the AF and at the same time requests the NWDAF to determine the model segmentation point; the NWDAF feeds the result back to the UE through the AF.
  • the receiving the first message for requesting artificial intelligence/machine learning AI/ML model segmentation analysis may include the following steps:
• Step c41: receive the seventh request sent by the UE(s) to participate in model joint reasoning. The seventh request is used to request performing a model joint reasoning operation with the NWDAF. The parameters carried in the seventh request include at least one of the following: the computing power that the UE(s) to participate in model joint reasoning can provide, the memory those UE(s) can provide, and the remaining power of those UE(s). If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the parameters carried in the seventh request also include: the privacy level of the required data set and/or the delay requirement information of different layers of the model. According to the parameters carried in the seventh request, determine whether the AF and the UE(s) to be involved in model joint reasoning have the capability to fully support execution of model joint reasoning;
• Step c42: if the UE(s) to be involved in model joint reasoning have the capability to fully support execution of model joint reasoning, use the parameters carried in the seventh request as the first message;
• Step c43: if the UE(s) to participate in model joint reasoning do not have the capability to fully support model joint reasoning, request the NF to find other UE(s) that can participate in model reasoning and to provide the computing power, available memory, and remaining power those other UE(s) can provide; and, if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, also request, for the other UE(s) that can participate in model reasoning, the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• Step c44: receive the search results sent by the NF. The search results include the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the search results also include: the privacy level of the data set required by the other UE(s) for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• Step c45: use the parameters carried in the seventh request and the search results as the first message.
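The sufficiency check of steps c41-c43 can be illustrated with a toy aggregate-capacity test; the idea of summing per-UE capability against a model-wide requirement, and all parameter names, are assumptions made for illustration:

```python
def fully_supports_joint_inference(ue_caps, required_compute, required_memory,
                                   min_battery):
    """ue_caps: (compute, memory, battery) tuples reported by the UE(s).

    Returns True when the UEs with enough remaining power can jointly cover
    the model's compute and memory needs; False means other UE(s) must be
    discovered (step c43).
    """
    usable = [(c, m) for c, m, b in ue_caps if b >= min_battery]
    return (sum(c for c, _ in usable) >= required_compute
            and sum(m for _, m in usable) >= required_memory)
```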
  • the delay requirement information of different layers of the model is provided by the AF.
• the seventh request result also includes the model segmentation point information corresponding to the other UE(s) participating in joint model reasoning, and this information is transparently transmitted to the UE through the AF.
  • FIG. 6 is a schematic diagram of a fourth signaling flow of a method for assisting model segmentation when the network entity is an NWDAF according to Embodiment 1 of the present disclosure. The specific steps are:
• the Nnwdaf_MLModelJointInference_Request (i.e., model joint inference request) is sent to the NWDAF.
  • the NF discovers other UE(s) that can participate in model reasoning and their related available computing power, available memory, remaining power, etc.
• the NWDAF calls Nnf_EventExposure_Subscribe toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), together with (if any) the privacy level of the data set required for inferring this model. The 5GC NF(s) forward the request information to the terminal; upon receiving the request (and if it accepts it), the terminal feeds back information such as the power, memory, and computing power corresponding to the Analytics ID (Model ID) and, if any, the privacy level of the data set to the NF (for example, AMF/SMF). Step 604: the 5GC NF(s) call Nnf_EventExposure_Notify to feed the required data back to the NWDAF.)
  • Step 605 NWDAF performs analysis based on the collected information and its own model, and judges the model segmentation point (or model segmentation ratio).
• Step 606: the NWDAF feeds the Nnwdaf_MLModelJointInference_Request response (i.e., the response to the model joint inference request) back to the UE.
• the response contains the model segmentation point (or model segmentation ratio) information. If there are other UE(s) participating in joint model reasoning, it also contains the model segmentation point (or model segmentation ratio) information of the corresponding UE(s), which is transparently transmitted to the UE through the AF.
  • the receiving the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis may include the following steps:
• Step d14: receive the ninth request result sent by the NWDAF, and use the ninth request result and the parameters carried in the eighth request as the first message. The ninth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the ninth request result also includes: the privacy level of the data set required by the other UE(s) for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
  • the method further includes:
• the result of determining whether the UE(s) to participate in model joint reasoning have the capability to fully support execution of model joint reasoning, the relevant information of those UE(s), and the model segmentation point information corresponding to those UE(s) are sent to the UE. The relevant information includes at least one of the following: the computing power that the UE(s) to participate in model joint reasoning can provide, the memory those UE(s) can provide, and the remaining power of those UE(s); if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the relevant information also includes: the privacy level of the required data set and/or the delay requirement information of different layers of the model.
  • FIG. 7 is a schematic flowchart of a first signaling flow of a method for assisting model segmentation when the network entity is an AF according to Embodiment 2 of the present disclosure.
• the specific steps are as follows. (In step 703a the AF is in the trusted zone; in steps 703b and 703c the AF is in the untrusted zone. In step 706a the AF is in the trusted zone; in steps 706b and 706c the AF is in the untrusted zone.)
• if the privacy level of the data set is required for inference of the AI/ML model, the request also includes the privacy level of the required data set; if not, it is checked whether it can be set based on SA3. If the delay requirement information of different layers of the model is required for inference of the AI/ML model, the request also includes that delay requirement information; if not, it is provided by the AF.
• Step 702: the AF evaluates the joint inference request. If other UE(s) are required to participate in joint model inference, steps 703 (i.e., steps 703a-703c) through 706 are executed; otherwise they are skipped.
• Steps 703 (i.e., steps 703a-703c) through 706 (i.e., steps 706a-706c) are as described in steps 503 (i.e., steps 503a-503c) through 505 and step 507 (i.e., steps 507a-507c) in the embodiment of Scenario 13. (That is: in step 703, the AF sends Nnwdaf_AnalyticsSubscription_Subscribe (i.e., an analytics subscription) to the NWDAF, the request including:
• the Analytics ID (MLModelSplit) associated with the Model ID;
• the UE's available computing power, available memory, and remaining power;
  • the privacy level of the data set required for reasoning this model and information such as the delay requirements of different layers of the model
  • request NWDAF to help discover other UE(s) that can participate in model reasoning and their related available computing power, available memory, remaining power, etc.
  • step 703b the AF sends a Nnef_MLModelSplit_Request to the NEF, and the request includes the information in the above step 703a.
  • Step 703c after NEF authorizes, send Nnwdaf_MLModelSplit_Request to NWDAF, and the request includes the above step 703a information.
• Step 704: the NWDAF calls Nnf_EventExposure_Subscribe toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), together with (if any) the privacy level of the data set required for inferring this model. The 5GC NF(s) forward the request information to the terminal; upon receiving the request (and if it accepts it), the terminal feeds back information such as the power, memory, and computing power corresponding to the Analytics ID (Model ID) and, if any, the privacy level of the data set to the NF (for example, AMF/SMF). Step 705: the 5GC NF(s) call Nnf_EventExposure_Notify to feed the required data back to the NWDAF.)
• Step 706 (i.e., steps 706a-706c) is as described in step 507 (i.e., steps 507a-507c) in the embodiment of Scenario 13. That is: the NWDAF sends Nnwdaf_AnalyticsSubscription_Notify (i.e., an analytics subscription notification) to the AF, and sends Nnwdaf_MLModelSplit_Response to the NEF; the content includes information such as the above-mentioned model segmentation point (or model segmentation ratio).
• Step 707: the AF determines the model segmentation points (model segmentation ratios) of the relevant UEs based on the received information and its own model and capability.
• Step 708: the AF sends the judgment result, the related UE information, and the corresponding model segmentation point (model segmentation ratio) information to the UE through the Nnf_MLModelJointInference_Request response.
• This information can also enable joint model reasoning between UEs.
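With the segmentation points and the peer-participant information delivered in step 708, joint reasoning is simply a pipeline in which each participant (a UE or the AF) runs its assigned layer range and forwards the intermediate result to the next participant. A toy sketch, where the plan structure and the trivial stand-in "layers" are assumptions for illustration:

```python
def run_joint_inference(x, plan):
    """plan: ordered (participant, layer_fn) pairs derived from the split points."""
    for participant, layer_fn in plan:
        x = layer_fn(x)  # each hop applies its assigned layers to the intermediate result
    return x

# UE1 -> UE2 -> AF, with trivial stand-in "layers":
plan = [("UE1", lambda v: v + 1), ("UE2", lambda v: v * 2), ("AF", lambda v: v - 3)]
result = run_joint_inference(1, plan)  # ((1 + 1) * 2) - 3 = 1
```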
  • the AF requests the NWDAF to collect the capabilities of the UE.
  • the AF judges the model segmentation point based on the received analysis results, and initiates a joint reasoning request to the relevant UE.
  • Receiving the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
  • the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the tenth request also include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step d22: receive the tenth request result sent by the NWDAF. The tenth request result includes at least one of the following: the UE(s) or SUPI to participate in AI/ML model segmentation, the identification of the application using the AI/ML model, the computing power available from the UE(s) to participate in model joint reasoning, the memory available from the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model. If the parameters carried in the tenth request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the tenth request result also includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • the method further includes:
  • Step d24: send an eleventh request to the UE(s) to participate in model joint reasoning; the eleventh request is used to request the UE(s) to perform the model joint reasoning operation. The parameters carried in the eleventh request include at least one of the following: model segmentation point information, the first message;
  • Step d25: receive the eleventh request result sent by the UE(s) to participate in model joint reasoning; the eleventh request result is determined by the UE(s) to participate in model joint reasoning based on the parameters carried in the eleventh request, and includes accepting the eleventh request or not accepting the eleventh request.
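The exchange in steps d21-d25 (ask the NWDAF for UE capability data, decide a split point, then ask each UE to accept or reject joint reasoning) can be sketched as follows. All class names, fields, and the in-memory NWDAF/UE stubs are illustrative assumptions for exposition, not the actual 3GPP service operations.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class TenthRequest:                 # hypothetical shape of the tenth request (step d21)
    analytics_id: str               # analysis type ID associated with the model ID
    ue_ids: List[str]
    privacy_level: Optional[int] = None   # optional: privacy level of the inference data set
    layer_delay_ms: Optional[int] = None  # optional: delay requirement of model layers

@dataclass
class TenthResult:                  # hypothetical shape of the tenth request result (step d22)
    ue_capabilities: Dict[str, int] # per-UE layer capacity derived from compute/memory/power

def af_joint_inference_flow(nwdaf, ues, request: TenthRequest):
    """Steps d21-d25: collect UE data via NWDAF, pick a split point, query each UE."""
    result = nwdaf.collect(request)                             # d21/d22
    split_point = max(1, min(result.ue_capabilities.values()))  # toy split decision
    answers = {}
    for ue_id in request.ue_ids:                                # d24: eleventh request per UE
        answers[ue_id] = ues[ue_id].joint_inference_request(split_point)
    return split_point, answers                                 # d25: accept / reject per UE
```

The split decision here is deliberately trivial (bounded by the weakest UE); the document leaves the actual decision logic to the network entity.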
  • FIG. 8 is a schematic diagram of a second signaling flow of a method for assisting model segmentation when the network entity is an AF provided in Embodiment 2 of the present disclosure.
  • The specific steps are as follows (step 801a applies when the AF is in the trusted domain, and steps 801b and 801c when the AF is in the untrusted domain; likewise, step 804a applies when the AF is in the trusted domain, and steps 804b and 804c when the AF is in the untrusted domain):
  • Steps 801 (that is, 801a-801c) to 804 (that is, 804a-804c): the AF requests the NWDAF to collect the capabilities of the UE and obtains the analysis results, as in steps 301 (that is, 301a-301c) to 303 and step 305 (that is, 305a-305c) in the embodiment of Scenario 11, namely:
  • Step 801b: the AF sends an Nnef_MLModelSplit_Request to the NEF; the request includes the information in Table 1 above.
  • Step 801c: after the NEF authorizes the request, it sends an Nnwdaf_MLModelSplit_Request to the NWDAF; the request includes the information in Table 1 above.
  • Step 802: the NWDAF invokes Nnf_EventExposure_Subscribe (event exposure subscription) toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), (if any) the privacy level of the data set required for inferring this model, and (if provided) the delay requirements of different layers of the model. The 5GC NF(s) (for example, AMF/SMF) forward this request to the terminal; upon receiving it, the terminal (if it accepts the request) feeds back the battery level, memory, and computing power corresponding to the Analytics ID (Model ID), and (if any) the privacy level of the data set (see Table 2) to the NF (for example, AMF/SMF).
  • Step 803: the 5GC NF(s) invoke Nnf_EventExposure_Notify (event exposure notification) to feed back the required data to the NWDAF.
  • Step 804a: the NWDAF sends information such as the model split point (or model split ratio) (see Table 3) to the AF through an Nnwdaf_MLModelSplit_Request Response.
  • Step 804b: the NWDAF sends an Nnwdaf_MLModelSplit_Request Response to the NEF; the content includes information such as the above-mentioned model split point (or model split ratio) (see Table 3).
  • Step 804c: after the NEF authorizes the response, it sends an Nnef_MLModelSplit_Request response to the AF; the content includes information such as the above-mentioned model split point (or model split ratio) (see Table 3).
  • Step 805 AF judges the model segmentation point based on the received information and its own model and capability.
  • Steps 806-808: the AF establishes a connection with the relevant UE(s) and exchanges the model joint reasoning request, as in steps 306-308 in the embodiment of Scenario 11 (that is:
  • Step 806: the AF establishes a connection with the relevant UE(s).
  • Step 807: the AF sends a model joint inference request, Naf_MLModelJointInference_Request, to the relevant UE(s); the request includes the model segmentation point (or model segmentation ratio) information.
  • Step 808: the UE(s) send a model joint inference request response, Naf_MLModelJointInference_Request response, to the AF, indicating whether the model joint inference request is accepted).
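The content of the Nnwdaf_MLModelSplit request and its response (Tables 1-3 are not reproduced in this excerpt) can be sketched with the following illustrative structures. The field names are assumptions derived from the parameters listed in the text, not a normative encoding of the service operation.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class MLModelSplitRequest:            # roughly the Table 1 parameters
    analytics_id: str                 # e.g. "MLModelSplit", associated with a Model ID
    model_id: str
    model_size_mb: float
    ue_ids: List[str]
    dataset_privacy_level: Optional[int] = None  # only if inference requires it
    layer_delay_req_ms: Optional[int] = None     # only if provided

@dataclass
class MLModelSplitResponse:           # roughly the Table 3 output
    model_id: str
    split_point: int                  # layer index at which the model is cut
    split_ratio: Optional[float] = None  # alternative representation of the cut

def to_nef_payload(req: MLModelSplitRequest) -> dict:
    """The NEF forwards the authorized request content onward (step 801c);
    optional fields that were not supplied are omitted from the payload."""
    return {k: v for k, v in asdict(req).items() if v is not None}
```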
  • the AF requests the new network entity MMF to perform model segmentation.
  • the MMF requests the NWDAF to collect the capabilities of the UE.
  • The MMF determines the model segmentation point and feeds back the result to the AF, and the AF initiates a joint reasoning request to the relevant UE.
  • Receiving the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
  • Step e11: receive the twelfth request sent by the application function (AF) directly or through the network exposure function (NEF); the twelfth request is used to request AI/ML model segmentation analysis. The parameters carried in the twelfth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the identification of a user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs that meet the analysis conditions, the area of AI/ML model segmentation, the size of the AI/ML model, the computing power available from the UE(s) to participate in model joint reasoning, the memory available from the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the twelfth request also include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step e12: send a thirteenth request to the NWDAF according to the parameters carried in the twelfth request; the parameters carried in the thirteenth request include the parameters carried in the twelfth request, and the thirteenth request is used to request the NWDAF to collect, from the 5GC NF(s), UE(s) data for analyzing AI/ML model segmentation;
  • Step e13: receive the thirteenth request result sent by the NWDAF; the thirteenth request result includes at least one of the following: the UE(s) or SUPI to participate in AI/ML model segmentation, the identification of the application using the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model. If the parameters carried in the thirteenth request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the thirteenth request result also includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step e14 using the thirteenth request result as the first message.
  • the method further includes:
  • the eighth message includes the AI/ML model segmentation point and the data in the first message; the eighth message is used to provide the AF with a joint
  • Nmmf_MLModelSplit_Request i.e. machine learning model segmentation request
  • Steps 902-905 are as described in steps 301c, 302, 303, and 305b in the embodiment of Scenario 11 (ie:
  • Step 902 after NEF authorizes, send Nnwdaf_MLModelSplit_Request to NWDAF, and the request includes the information in Table 1 above.
  • Step 903: the NWDAF invokes Nnf_EventExposure_Subscribe (event exposure subscription) toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), (if any) the privacy level of the data set required for inferring this model, and (if provided) the delay requirements of different layers of the model. The 5GC NF(s) (for example, AMF/SMF) forward this request to the terminal; upon receiving it, the terminal (if it accepts the request) feeds back the battery level, memory, and computing power corresponding to the Analytics ID (Model ID), and (if any) the privacy level of the data set (see Table 2) to the NF (for example, AMF/SMF).
  • Step 906: the MMF selects the model segmentation point (or model segmentation ratio) based on the analysis output data from the NWDAF (see Table 3), that is, based on the UE's available computing power, available memory, and remaining power, the privacy level of the data set required for inferring this model, and the delay requirements of different layers of the model.
  • Step 907 MMF sends information such as model split point (model split ratio) to AF through Nmmf_MLModelSplit_Request response (response to machine learning model split request)
  • Steps 908-9010 are as described in steps 306-308 in the embodiment of Scenario 11 (ie:
  • step 908 the AF establishes a connection with the relevant UE(s).
  • step 909 the AF sends a model joint inference request Naf_MLModelJointInference_Request to the relevant UE(s), and the request includes model segmentation point (or model segmentation ratio) information.
  • Step 9010 UE(s) sends the response Naf_MLModelJointInference_Request response to the AF, indicating whether to accept the model joint inference request).
  • Scenario 32: when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF to request model joint inference, and the AF passes its own and the UE's capabilities to the MMF and requests the MMF to perform model segmentation. If the MMF determines that other UE(s) are required to participate in model joint reasoning, the MMF requests the NWDAF to collect information about the other UE(s); the MMF then performs model segmentation based on the collected information and feeds back the results to the AF.
  • Receiving the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
  • Step e21: receive the fifteenth request sent by the application function (AF) directly or through the network exposure function (NEF); the fifteenth request is used to request AI/ML model segmentation analysis. The parameters carried in the fifteenth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the computing power available from the UE(s) to participate in model joint reasoning, the memory available from the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the fifteenth request also include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step e23: if the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, use the parameters carried in the fifteenth request as the first message;
  • Step e24: if the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, send a sixteenth request to the NWDAF. The parameters carried in the sixteenth request include the parameters carried in the fifteenth request, and the sixteenth request is used to request the NWDAF, through the NF, to find other UE(s) that can participate in model reasoning, as well as the computing power, available memory, and remaining power that those UE(s) can provide. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the sixteenth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step e25: receive the sixteenth request result sent by the NWDAF, and use the sixteenth request result together with the parameters carried in the fifteenth request as the first message. The sixteenth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that they can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the sixteenth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • The sixteenth request result is determined by the NWDAF by collecting the fourth data corresponding to the UE(s) from the 5GC NF(s). The fourth data includes at least one of the following: the UE(s) or SUPI to participate in AI/ML model segmentation, the computing power that can be provided by the UE(s) to participate in model joint reasoning (here, the UE(s) to participate in model joint reasoning are the other UE(s) that can participate in model reasoning mentioned above), the memory that those UE(s) can provide, the remaining power of those UE(s), and the size of the AI/ML model. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the fourth data also includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • the method further includes:
  • The fifteenth request result also includes the model segmentation point information corresponding to the other UE(s) participating in model joint reasoning, and this information is transparently transmitted to the UE(s) to participate in model joint reasoning.
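The branch in steps e23-e25 (use the fifteenth request as-is when the requesting UE can fully support joint reasoning, otherwise ask the NWDAF to discover additional UEs) can be sketched as follows. The capability test and the NWDAF lookup are simplified placeholders, not the actual service calls.

```python
def build_first_message(request_params: dict, ue_capability: dict,
                        model_requirement: dict, nwdaf_discover):
    """Steps e23-e25: decide whether extra UEs are needed for joint reasoning.

    request_params    - parameters carried in the fifteenth request
    ue_capability     - compute / memory / power of the requesting UE
    model_requirement - minimum resources the model needs end to end
    nwdaf_discover    - callable standing in for the sixteenth request to NWDAF
    """
    fully_supported = all(ue_capability.get(k, 0) >= v
                          for k, v in model_requirement.items())
    if fully_supported:                          # step e23
        return dict(request_params)
    extra_ues = nwdaf_discover(request_params)   # steps e24/e25
    return {**request_params, "other_ues": extra_ues}
```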
  • FIG. 10 is a schematic flowchart of a second signaling flow of a method for assisting model segmentation when the network entity is an MMF according to Embodiment 3 of the present disclosure. The specific steps are:
  • Steps 1000-1001 are described as steps 500-501 in the embodiment of Scenario 13 (ie:
  • step 1000 the AF establishes a connection with the UE.
  • The request includes the Analytics ID (MLModelSplit) associated with the Model ID, and information such as the UE's own available computing power, available memory, and remaining power; if the privacy level of the data set is required for AI/ML model reasoning, the request also includes the privacy level of the data set required for AI/ML model reasoning (if not provided, it is checked whether it can be set).
  • Step 1002 AF sends a model split request Nmmf_MLModelSplit_Request to MMF, the information contained in the request is as described in step 503a in the embodiment of scenario 13 (ie:
  • Step 1003: the MMF evaluates the joint reasoning request; if other UE(s) are needed to participate in model joint reasoning, steps 1004-1007 are executed, otherwise they are skipped.
  • Steps 1004-1007 are as described in steps 503c, 504, 505, and 507b in the embodiment of Scenario 13, namely:
  • Step 1004: after the NEF authorizes the request, it sends an Nnwdaf_MLModelSplit_Request to the NWDAF; the request includes the above information.
  • Step 1005: the NWDAF invokes Nnf_EventExposure_Subscribe toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), (if any) the privacy level of the data set required for inferring this model, and (if it can be provided) the delay requirements of different layers of the model. The 5GC NF(s) (for example, AMF/SMF) forward this request to the terminal; upon receiving it, the terminal (if it accepts the request) feeds back the battery level, memory, and computing power corresponding to the Analytics ID (Model ID), and (if any) the privacy level of the data set to the NF (for example, AMF/SMF).
  • Step 1006: the 5GC NF(s) invoke Nnf_EventExposure_Notify to feed back the required data to the NWDAF.
  • Step 1007: the NWDAF sends an Nnwdaf_MLModelSplit_Request Response to the NEF; the content includes information such as the above-mentioned model split point (or model split ratio).
  • Step 1008 MMF performs model segmentation judgment based on the collected information
  • Step 1009: the MMF sends the result to the AF through an Nmmf_MLModelSplit_Request response; the information contained in the response is as described in step 305 in the embodiment of Scenario 11, that is, the model segmentation point (or model segmentation ratio) and other information, as shown in Table 3.
  • Step 1010 is as described in step 508 in the embodiment of Scenario 13, that is: the AF sends the model joint inference request response Naf_MLModelJointInference_Request response to the UE; the response includes the model segmentation point (or model segmentation ratio), and the other UE(s) that can participate in model inference together with their related model segmentation point (or model segmentation ratio) information.
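Step 1010's response carries both the requesting UE's split point and those of the other participating UE(s). One way to picture the information elements named in the text is the structure below; it is only an illustration, not a defined message format.

```python
def build_joint_inference_response(own_split_point, other_ue_splits):
    """Assemble the content of the Naf_MLModelJointInference_Request response:
    the model segmentation point for the requesting UE, plus the other UE(s)
    that can participate in model inference and their related split points."""
    return {
        "split_point": own_split_point,
        "participants": [
            {"ue_id": ue_id, "split_point": sp}
            for ue_id, sp in other_ue_splits.items()
        ],
    }
```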
  • Embodiment 4: the network entity is the PCF, and the PCF determines the model segmentation point.
  • the method for assisting model segmentation will be described in detail below by taking at least two scenarios as examples.
  • The AF requests the PCF to determine the model splitting strategy, and the PCF requests the NWDAF to collect the capabilities of the UE. Based on the received analysis results, the PCF determines the model splitting point and feeds back the results to the AF, after which the AF initiates a joint reasoning request to the relevant UE.
  • Receiving the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
  • Step f11: receive the seventeenth request sent by the application function (AF) directly or through the network exposure function (NEF); the seventeenth request is used to request AI/ML model segmentation analysis. The parameters carried in the seventeenth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the identification of a user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs that meet the analysis conditions, the area of AI/ML model segmentation, the size of the AI/ML model, the computing power available from the UE(s) to participate in model joint reasoning, the memory available from the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the seventeenth request also include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step f12 Send an eighteenth request to the network data analysis function NWDAF according to the parameters carried in the seventeenth request, where the parameters carried in the eighteenth request include the parameters carried in the seventeenth request, and The eighteenth request is used to request the NWDAF to collect the fifth data corresponding to the UE(s) from the 5GC NF(s);
  • the fifth data includes at least one of the following items: the UE to participate in AI/ML model segmentation (s) or SUPI, computing power available to UE(s) to participate in model joint reasoning, memory available to UE(s) to participate in model joint reasoning, remaining power of UE(s) to participate in model joint reasoning , the size of the AI/ML model; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the fifth data also includes: the data set required for reasoning the AI/ML model Information on privacy levels and/or latency requirements for different layers of the model;
  • Step f13 receiving the eighteenth request result sent by NWDAF; wherein, the eighteenth request result includes the fifth data;
  • the method further includes:
  • FIG. 11 is a schematic flowchart of a first signaling flow of a method for assisting model segmentation when the network entity is a PCF provided by Embodiment 4 of the present disclosure. The specific steps are:
  • Npcf_MLModelSplit_Request i.e. machine learning model segmentation request
  • Steps 1102-1105 are as described in steps 301c, 302, 303, and 305b in the embodiment of Scenario 11 (ie:
  • Step 1102 after NEF is authorized, send Nnwdaf_MLModelSplit_Request to NWDAF, and the request includes the information in the above Table 1.
  • Step 1103: the NWDAF invokes Nnf_EventExposure_Subscribe (event exposure subscription) toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), (if any) the privacy level of the data set required for inferring this model, and (if provided) the delay requirements of different layers of the model. The 5GC NF(s) (for example, AMF/SMF) forward this request to the terminal; upon receiving it, the terminal (if it accepts the request) feeds back the battery level, memory, and computing power corresponding to the Analytics ID (Model ID), and (if any) the privacy level of the data set (see Table 2) to the NF (for example, AMF/SMF).
  • Step 1104 5GC NF(s) calls Nnf_EventExposure_Notify to feed back required data to NWDAF.
  • Step 1106: the PCF selects the model split point (or model split ratio) based on the analysis output data from the NWDAF, that is, based on the UE's available computing power, available memory, and remaining power, the privacy level of the data set required for inferring this model, and the delay requirements of different layers of the model.
  • When the computing power and memory available to the UE are low and the remaining power is low, the UE can only perform a few layers of reasoning; likewise, if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, and the privacy level requirements are high and/or the delay requirements of different layers of the model are low (that is, strict), fewer layers of reasoning can be done.
  • If the UE's available computing power and memory are sufficient and the remaining power is high, more layers of reasoning can be done; similarly, if the privacy level requirements are low and/or the delay requirements of different layers of the model are high (that is, relaxed), more layers of reasoning can be done. For example, suppose that for an 8-layer AI/ML model, UE1 can perform two layers of reasoning (for example, UE1 performs the first two layers) and UE2 can perform three layers of reasoning (for example, UE2 performs the middle three layers).
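The worked example above (an 8-layer model where UE1 takes the first two layers and UE2 the middle three) amounts to a greedy assignment of consecutive layer ranges by per-UE capacity. A minimal sketch, with the capacity numbers assumed rather than derived from real measurements:

```python
def assign_layers(total_layers, participants):
    """Greedily assign consecutive layer ranges to participants in order.

    participants: list of (name, max_layers) pairs, e.g. capability-derived.
    Returns {name: (first_layer, last_layer)} with 1-based inclusive indices;
    any layers left over are assigned to the AF / network side.
    """
    plan, next_layer = {}, 1
    for name, capacity in participants:
        if next_layer > total_layers:
            break
        last = min(next_layer + capacity - 1, total_layers)
        plan[name] = (next_layer, last)
        next_layer = last + 1
    if next_layer <= total_layers:        # remainder runs on the network side
        plan["AF"] = (next_layer, total_layers)
    return plan
```

With `assign_layers(8, [("UE1", 2), ("UE2", 3)])`, UE1 gets layers 1-2, UE2 gets layers 3-5, and the remaining layers 6-8 fall to the network side, matching the example in the text.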
  • Step 1107 PCF sends information such as model split point (model split ratio) to AF through Npcf_MLModelSplit_Request response (that is, the response to the machine learning model split request).
  • Step 1108 the AF establishes a connection with the relevant UE(s).
  • Scenario 42: when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF to request model joint inference, and the AF sends its own and the UE's capabilities to the PCF and requests the PCF to perform model segmentation. If the PCF determines that other UE(s) are required to participate in model joint reasoning, the PCF requests the NWDAF to collect information about the other UE(s); the PCF then performs model segmentation based on the collected information and feeds back the results to the AF.
  • Receiving the first message for assisting artificial intelligence/machine learning (AI/ML) model segmentation analysis may include the following steps:
  • Step f21: receive the twentieth request sent by the application function (AF) directly or through the network exposure function (NEF); the twentieth request is used to request AI/ML model segmentation analysis. The parameters carried in the twentieth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the computing power available from the UE(s) to participate in model joint reasoning, the memory available from the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning;
  • Step f22 determine whether the AF and the UE to participate in model joint reasoning have the ability to fully support model joint reasoning;
  • Step f23 if the AF and the UE(s) to participate in model joint reasoning have the ability to fully support the execution of model joint reasoning, then use the parameters carried in the twentieth request as the first message;
  • Step f24: if the AF and the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, send a twenty-first request to the NWDAF. The parameters carried in the twenty-first request include the parameters carried in the twentieth request, and the twenty-first request is used to request the NWDAF, through the NF, to find other UE(s) that can participate in model reasoning, as well as the computing power, available memory, and remaining power that those UE(s) can provide. If the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the twenty-first request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • Step f25 Receive the twenty-first request result sent by NWDAF, and use the twenty-first request result and the parameters carried in the twentieth request as the first message; wherein, the twenty-first The request results include: other UE(s) that can participate in model inference and the computing power, available memory, and remaining power available to other UE(s) that can participate in model inference.
  • The twenty-first request result may also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inferring the AI/ML model and/or the delay requirement information of different layers of the model. The twenty-first request result is determined by the NWDAF by collecting the sixth data corresponding to the UE(s) from the 5GC NF(s); the sixth data includes at least one of the following: the UE(s) or SUPI to participate in AI/ML model segmentation, and the UE(s) to participate in model joint reasoning (here, the UE(s) to participate in model joint reasoning refers to the other UE(s) that can participate in model reasoning mentioned above).
  • the method further includes:
  • The twentieth request result also includes the model segmentation point information corresponding to the other UE(s) participating in model joint reasoning, and this information is transparently transmitted to the UE.
  • FIG. 12 is a schematic diagram of a second signaling flow of a method for assisting model segmentation when the network entity is a PCF provided by Embodiment 4 of the present disclosure. The specific steps are:
  • Steps 1200-1201 are described as steps 500-501 in the embodiment of Scenario 13 (ie:
  • Step 1200 the AF establishes a connection with the UE
  • The request includes the Analytics ID (MLModelSplit) associated with the Model ID, and information such as the UE's own available computing power, available memory, and remaining power; if the privacy level of the data set is required for AI/ML model reasoning, the request also includes the privacy level of the data set required for AI/ML model reasoning (if not provided, it is checked whether it can be set).
  • Step 1203: based on the received information, such as the capabilities reported by the UE and the AF's own capability, the PCF evaluates the joint reasoning request; if other UE(s) need to participate in model joint reasoning, steps 1204-1207 are executed, otherwise they are skipped.
  • Steps 1204-1207 are as described in steps 503c, 504, 505, and 507b in the embodiment of Scenario 13 (ie:
  • Step 1205: the NWDAF invokes Nnf_EventExposure_Subscribe toward the 5GC NF(s) (for example, AMF/SMF) to collect the available computing power, available memory, and remaining power of the UE(s), and (if any) the privacy level of the data set required for inferring this model.
  • The 5GC NF(s) forward this request to the terminal; upon receiving it, the terminal (if it accepts the request) feeds back the battery level, memory, and computing power corresponding to the Analytics ID (Model ID), and (if any) the privacy level of the data set, to the NF (for example, AMF/SMF).
  • Step 1206 5GC NF(s) calls Nnf_EventExposure_Notify to feed back required data to NWDAF.
  • step 1208 the PCF makes a model segmentation judgment based on the collected information.
  • Step 1209: the PCF sends the result to the AF through an Npcf_MLModelSplit_Request response; the information contained in the response is as described in step 305 in the embodiment of Scenario 11, that is, the model segmentation point (or model segmentation ratio) and other information, as shown in Table 3.
  • Step 1210 is as described in step 508 in the embodiment of Scenario 13, that is: the AF sends the model joint inference request response Naf_MLModelJointInference_Request response to the UE; the response includes the model segmentation point (or model segmentation ratio), and the other UE(s) that can participate in model inference together with their related model segmentation point (or model segmentation ratio) information.
  • FIG. 13 is a second schematic flow chart of the method for assisting model segmentation provided by an embodiment of the present disclosure.
  • The execution subject of the method for assisting model segmentation provided by this embodiment is a user equipment (or terminal) UE; the method for assisting model segmentation provided by the embodiment of the present disclosure then includes the following steps:
  • determining the AI/ML model segmentation point information according to its own capability information may include the following steps:
• The terminal can select a model segmentation point (that is, model segmentation point information or AI/ML model segmentation point information) based on information such as its own available power, available memory, and available computing power, as well as the delay requirements of this model and the privacy level of the inference data set required by this model; the terminal then sends the model segmentation point information to the network entity or terminal participating in model joint reasoning.
  • determining the model segmentation point information may include the following steps:
• Step h2: If the computing power available to the target UE is higher than the first preset computing power threshold and lower than the second preset computing power threshold, the memory available to the target UE is higher than the first preset memory threshold and lower than the second preset memory threshold, and the remaining power of the target UE is higher than the first preset power threshold and lower than the second preset power threshold, it is determined that the target UE performs reasoning on the second preset number of layers; the first preset number of layers is less than the second preset number of layers;
• Step h4: If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is lower than the first preset privacy level or the delay requirement information of different layers of the model is lower than the first preset delay, then the target UE performs reasoning on the first preset number of layers;
• Step h6: If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the second preset privacy level and lower than the third preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than the third preset delay, it is determined that the target UE performs reasoning on the third preset number of layers; and so on, until it is determined that the target UE performs reasoning on the (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers;
• Step h7: Determine the AI/ML model segmentation point information according to the preset number of layers; the AI/ML model segmentation point information is used to indicate the AI/ML model segmentation ratio, and the preset number of layers includes the Nth preset number of layers, where N is greater than or equal to one.
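The threshold banding of steps h1-h7 can be illustrated with a short sketch. All function names, threshold values, and layer counts below are hypothetical; the disclosure only fixes that ascending capability bands map to ascending preset numbers of layers, and one consistent reading is that the UE is limited by its weakest resource (the minimum band across compute, memory, and power).

```python
def band(value, thresholds):
    """Index of the highest preset threshold that value exceeds
    (thresholds ascending); band 0 means below the first threshold."""
    n = 0
    for t in thresholds:
        if value > t:
            n += 1
    return n

def layers_for_ue(compute, memory, power,
                  compute_thr, memory_thr, power_thr, layer_counts):
    """layer_counts[n] is the (n+1)-th preset number of layers, ascending.
    The minimum band across the three resources selects the count."""
    n = min(band(compute, compute_thr),
            band(memory, memory_thr),
            band(power, power_thr))
    return layer_counts[n]

# e.g. with hypothetical thresholds [10, 20] for every resource and preset
# layer counts [1, 3, 5]: a UE with compute 15, memory 25, power 18 falls
# in band 1 (limited by compute and power) and infers 3 layers.
```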
• Each UE that participates in model joint reasoning needs to determine the number of layers, or the part of the model, over which it will reason. For each UE (i.e., the target UE), the segmentation point can be determined through the following process:
• For example, UE1 can do two-layer reasoning (e.g., UE1 reasons over the first two layers), UE2 can do three-layer reasoning (e.g., UE2 reasons over the middle three layers), and the AF does the reasoning of the last three layers; that is, UE1 corresponds to model segmentation point 1 and UE2 corresponds to model segmentation point 2.
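The UE1/UE2/AF example above maps per-participant layer counts to segmentation points cumulatively. The following sketch (the helper name is hypothetical) shows the arithmetic: with the last participant taking the remainder, an allocation of 2, 3, and 3 layers over an eight-layer model yields segmentation point 1 after layer 2 and segmentation point 2 after layer 5.

```python
def split_points(layer_counts):
    """Cumulative cut indices for an ordered list of per-participant layer
    counts; the last participant takes the remaining layers, so there is
    one fewer segmentation point than there are participants."""
    points, total = [], 0
    for n in layer_counts[:-1]:
        total += n
        points.append(total)
    return points

# UE1: 2 layers, UE2: 3 layers, AF: last 3 layers
# split_points([2, 3, 3]) -> [2, 5]
```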
• The model segmentation point information is determined according to the available computing power, available memory, and remaining power in the UE's own capability information. If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model in the UE's own capability information can also be combined to determine the model segmentation point information for splitting the AI/ML model. This information, as the segmentation result, enables the network entity or the UE(s) to participate in model joint reasoning to perform joint reasoning operations on the model based on the segmentation result. Model segmentation analysis based on terminal (i.e., UE) capability is thereby realized, which in turn effectively protects terminal privacy and optimizes network resources.
  • the method for assisting model segmentation will be described in detail below by taking at least two scenarios as examples.
• Scenario 51: When the UE and the AF jointly infer a model, the UE judges the model segmentation point based on its own capabilities, reports the model segmentation point (model segmentation ratio) to the AF, and performs joint inference interaction.
  • the method may further include the following steps:
• Step i1: Send a first request to the application function AF, where the first request is used to request a model joint inference operation with the AF. The parameters carried in the first request include at least one of the following: the AI/ML model segmentation point, the analysis type identification associated with the model identification or model segmentation identification, the identification of a user equipment UE or a group of UEs accepting AI/ML model segmentation or of any UEs that meet the analysis conditions, the area for AI/ML model segmentation, the size of the AI/ML model, the computing power that can be provided by the UE to participate in model joint reasoning, the memory that can be provided by the UE to participate in model joint reasoning, and the remaining power of the UE to participate in model joint reasoning. If the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the parameters carried in the first request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• Step i2: Receive the first request result sent by the AF; the first request result is determined by the AF according to the parameters carried in the first request, and includes accepting the first request or not accepting the first request;
• If not included in the request, the delay requirement information of different layers of the model is provided by the AF.
  • the implementation process is:
• Step 1401 is as described in step 401 in the embodiment of Scenario 12 (i.e.:
• The UE initiates a Naf_MLModelJointInference_Request toward the AF to request ML model joint inference.
• If the privacy level of the data set is required for inference of the ML model, the request also includes the privacy level of the data set required for inference of the AI/ML model; if not available, check whether it can be set based on SA3. If the delay of different layers of the model is required for inference of the AI/ML model, the request also includes the delay requirement information of different layers of the model; if not, it will be provided by the AF.
• In addition, the request also carries the model segmentation point (model segmentation ratio) information to report to the AF.
• Step 1402: As shown in steps 806-808 in the embodiment of Scenario 22, the UE and the AF establish model joint reasoning.
• The AF sends a model joint inference request Naf_MLModelJointInference_Request to the relevant UE(s); the request includes the model segmentation point (or model segmentation ratio) information.
• The UE(s) sends the Naf_MLModelJointInference_Request response to the AF to indicate whether it accepts the model joint inference request.)
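The Scenario 51 exchange can be sketched from the AF side. The function name and the acceptance criterion below are hypothetical assumptions for illustration; the disclosure only states that the AF responds to the reported segmentation point with acceptance or rejection, so one plausible criterion is whether the AF can serve the layers left to it.

```python
def af_handle_joint_inference_request(total_layers, ue_split_point,
                                      af_max_layers):
    """Hypothetical AF-side check: accept the UE-reported segmentation
    point only if it leaves the AF a non-empty share it can serve.
    Returns (accepted, layers_for_af)."""
    remaining = total_layers - ue_split_point
    if 0 < ue_split_point < total_layers and remaining <= af_max_layers:
        return True, remaining
    return False, 0
```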
• Scenario 52: When the UE and the NWDAF jointly infer a model, the UE judges the model segmentation point based on its own capabilities, reports it to the NWDAF, and performs joint inference interaction.
  • the method may further include the following steps:
• Step j2: Receive the second request result sent by the NWDAF; the second request result is determined by the NWDAF according to the parameters carried in the first request, and includes accepting the first request or not accepting the first request;
• If not included in the request, the delay requirement information of different layers of the model is provided by the AF.
  • the implementation process is:
• Step 1501 is as described in step 401 in the embodiment of Scenario 12 (i.e., an Nnef_MLModelSplit_Request containing the above information is sent to the NEF; after the NEF authorizes it, the NEF sends an Nnwdaf_MLModelSplit_Request to the NWDAF, and that request contains the above information), whereby the information is sent to the NWDAF.
• Step 1503: The NWDAF and the UE establish model joint reasoning.
• The UE sends a model joint inference request together with the parameters included in the request, and receives from the NWDAF/AF a model joint inference response together with the parameters included in the response. The terminal thus selects the model segmentation point based on information such as its available power, available memory, and available computing power, as well as the delay requirements of this model and the privacy level of the inference data set required by this model, and sends the model segmentation point information to the network entity or terminal participating in model joint reasoning. Model segmentation reasoning based on terminal capabilities is realized, which is beneficial to protecting terminal privacy and optimizing network resources.
  • FIG. 14 is a schematic structural diagram of an apparatus for assisting model segmentation provided by an embodiment of the present disclosure. As shown in FIG. 14 , the apparatus for assisting model segmentation provided by this embodiment is applied to a network entity. Then, the apparatus for assisting model segmentation provided in this embodiment includes: a transceiver 1400 configured to receive and send data under the control of a processor 1410 .
• The bus architecture may include any number of interconnected buses and bridges; specifically, one or more processors represented by the processor 1410 and various circuits of the memory represented by the memory 1420 are linked together.
  • the bus architecture can also link together various other circuits such as peripherals, voltage regulators, and power management circuits, etc., which are well known in the art and therefore will not be further described herein.
  • the bus interface provides the interface.
  • the transceiver 1400 may be a plurality of elements, including a transmitter and a receiver, providing a unit for communicating with various other devices over transmission media, including wireless channels, wired channels, optical cables, and other transmission media.
• The processor 1410 is responsible for managing the bus architecture and general processing, and the memory 1420 can store data used by the processor 1410 when performing operations.
• The processor 1410 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or a complex programmable logic device (CPLD); the processor may also adopt a multi-core architecture.
  • the memory 1420 is used to store computer programs; the transceiver 1400 is used to send and receive data under the control of the processor 1410; the processor 1410 is used to read the computer programs in the memory and perform the following operations:
• The first message includes at least one of the following items: the UE(s) to participate in AI/ML model segmentation or their subscription permanent identifier SUPI, the identification of the application using the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model;
• The first message also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model.
• When the processor 1410 is configured to determine the segmentation result of the AI/ML model segmentation according to the first message, the operation specifically includes:
• determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning according to the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; or determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning according to the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, the privacy level of the data set required for inference of the AI/ML model, and/or the delay requirement information of different layers of the model;
• When the processor 1410 is configured to determine the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning, the operation specifically includes:
• if the computing power that the target UE can provide is lower than the first preset computing power threshold, or the memory that the target UE can provide is lower than the first preset memory threshold, or the remaining power of the target UE is lower than the first preset power threshold, it is determined that the target UE performs reasoning on the first preset number of layers;
• if the computing power available to the target UE is higher than the second preset computing power threshold and lower than the third preset computing power threshold, the memory available to the target UE is higher than the second preset memory threshold and lower than the third preset memory threshold, and the remaining power of the target UE is higher than the second preset power threshold and lower than the third preset power threshold, it is determined that the target UE performs reasoning on the third preset number of layers;
• if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the first preset privacy level and lower than the second preset privacy level, and the delay requirement information of different layers of the model is higher than the first preset delay and lower than the second preset delay, it is determined that the target UE performs reasoning on the second preset number of layers;
• if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, and the privacy level of the data set required for inference of the AI/ML model is higher than the second preset privacy level and lower than the third preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than the third preset delay, it is determined that the target UE performs reasoning on the third preset number of layers; and so on, until it is determined that the target UE performs reasoning on the (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers;
• the AI/ML model segmentation point information is used to represent the AI/ML model segmentation ratio, and the preset number of layers includes the Nth preset number of layers, where N is greater than or equal to one.
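The privacy/delay refinement above can be illustrated with a sketch parallel to the capability banding: both the privacy level and the per-layer delay requirement must reach a band ("higher than the nth and lower than the (n+1)th preset" values) before that band's preset number of layers applies. All names and threshold values below are hypothetical.

```python
def band(value, thresholds):
    """Index of the highest preset threshold that value exceeds."""
    n = 0
    for t in thresholds:
        if value > t:
            n += 1
    return n

def layers_from_privacy_delay(privacy_level, layer_delay,
                              privacy_thr, delay_thr, layer_counts):
    """Both criteria must reach a band before it is used, mirroring the
    conjunctive 'higher than ... and lower than ...' conditions; higher
    privacy and tighter delay keep more layers on the UE."""
    n = min(band(privacy_level, privacy_thr), band(layer_delay, delay_thr))
    return layer_counts[n]
```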
• When the network entity is a network data analysis function NWDAF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the parameters carried in the first request include at least one of the following: the analysis type identification associated with model segmentation, the identification of a user equipment UE or a group of UEs that accept AI/ML model segmentation or of any UEs that meet the segmentation conditions, the area for AI/ML model segmentation, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the first request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• sending, according to the parameters carried in the first request, a second message to the 5GC NF(s), where the second message is used to request the 5GC NF(s) to collect first data corresponding to the UE(s); the first data includes at least one of the following items: the UE(s) to participate in AI/ML model segmentation and their SUPI, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the first request include the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model, the first data also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending a third message to the AF, where the third message includes the AI/ML model segmentation point information and the first data; the third message is used to provide the AF with the parameters carried when sending the second request to the UE(s) to participate in model joint reasoning, the second request is used to request the UE(s) to perform a model joint reasoning operation, and the parameters carried in the second request include the third message.
• When the network entity is a network data analysis function NWDAF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending a fourth message to the AF, where the fourth message includes the AI/ML model segmentation point and the data in the first message; the fourth message is used to provide the AF with a fourth request result sent to the UE(s) to participate in model joint reasoning, and the fourth request result includes the AI/ML model segmentation point information.
• When the network entity is a network data analysis function NWDAF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the fifth request is used to request AI/ML model segmentation analysis and to request finding other UE(s) that can participate in model reasoning together with the computing power, available memory, and remaining power that those other UE(s) can provide; the parameters carried in the fifth request include at least one of the following: the analysis type identification associated with model segmentation, the identification of a user equipment UE or a group of UEs that accept AI/ML model segmentation or of any UEs that meet the segmentation conditions, the area for AI/ML model segmentation, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning;
• If not included in the request, the delay requirement information of different layers of the model is provided by the AF.
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending a sixth message to the AF, where the sixth message includes at least the AI/ML model segmentation point information and the second data; the sixth message is used to provide the AF with the sixth request result sent to the UE(s) to participate in model joint reasoning, and the sixth request result includes the AI/ML model segmentation point information corresponding to the UE(s) to participate in model joint reasoning, the other UE(s) that can participate in model reasoning, and their corresponding AI/ML model segmentation point information.
• When the network entity is a network data analysis function NWDAF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the seventh request is used to request a model joint reasoning operation with the NWDAF; the parameters carried in the seventh request include at least one of the following: the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required for inference of the AI/ML model, the parameters carried in the seventh request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model; determining, according to the parameters carried in the seventh request, whether the AF and the UE(s) to participate in model joint reasoning have the ability to fully support the execution of model joint reasoning; if the UE(s) to participate in model joint reasoning have the ability to fully support the execution of model joint reasoning, using the parameters carried in the seventh request as the first message;
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending a seventh request result to the UE(s) to participate in model joint reasoning, where the seventh request result includes the AI/ML model segmentation point information; the seventh request result also includes the model segmentation point information corresponding to the other UE(s) participating in model joint reasoning, and the information is transparently transmitted to the UE(s) through the AF.
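The check of whether the participants "have the ability to fully support the execution of model joint reasoning" can be sketched as follows. The function name, the dictionary keys, and the sufficiency criterion (summed compute and memory against the model's needs) are hypothetical assumptions; the disclosure does not specify the exact criterion.

```python
def can_fully_support(required_compute, model_size_mb, participants):
    """Hypothetical sufficiency test: sum the resources the AF and each
    candidate UE can offer and compare with what the model needs.
    participants: iterable of dicts with 'compute' and 'memory_mb' keys."""
    total_compute = sum(p["compute"] for p in participants)
    total_memory = sum(p["memory_mb"] for p in participants)
    return total_compute >= required_compute and total_memory >= model_size_mb
```

If the test fails, the flow falls back to asking the NWDAF to find other UE(s) that can participate in model reasoning, as in the ninth/sixteenth request branches described elsewhere in this disclosure.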
• the eighth request is used to request a model joint inference operation with the AF; the parameters carried in the eighth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the area for AI/ML model segmentation, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the eighth request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• if the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, using the parameters carried in the eighth request as the first message;
• otherwise, sending a ninth request to the network data analysis function NWDAF, where the parameters carried in the ninth request include the parameters carried in the eighth request, and the ninth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning together with the computing power, available memory, and remaining power that those other UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the ninth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• receiving the ninth request result sent by the NWDAF, and using the ninth request result and the parameters carried in the eighth request as the first message; the ninth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory, and remaining power that those other UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the ninth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• If not included in the request, the delay requirement information of different layers of the model is provided by the AF.
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending to the UE the result of determining whether the UE(s) to participate in model joint reasoning have the ability to fully support the execution of model joint reasoning, the related information of the UE(s) to participate in model joint reasoning, and the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning; the related information includes at least one of the following items: the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the related information also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model.
• When the network entity is an application function AF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the tenth request is used to request the NWDAF to collect from the 5GC NF(s) the data corresponding to the UE(s) for analyzing AI/ML model segmentation; the parameters carried in the tenth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the identification of a user equipment UE or a group of UEs accepting AI/ML model segmentation or of any UEs that meet the analysis conditions, the area for AI/ML model segmentation, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the tenth request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model; the tenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or their SUPI, the identification of the application using the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the tenth request include the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model, the tenth request result also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
• The processor 1410 is further configured to:
• sending an eleventh request to the UE(s) to participate in model joint reasoning, where the eleventh request is used to request the UE(s) to participate in model joint reasoning to perform a model joint reasoning operation, and the parameters carried in the eleventh request include at least one of the following: the model segmentation point information and the first message; receiving the eleventh request result, where the eleventh request result includes accepting the eleventh request or not accepting the eleventh request.
• When the network entity is a new network entity MMF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the thirteenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or their SUPI, the identification of the application using the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the thirteenth request include the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model, the thirteenth request result also includes: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model; the thirteenth request result is used as the first message.
• The processor 1410 is further configured to:
• after determining the segmentation result of the AI/ML model segmentation, sending an eighth message to the AF, where the eighth message includes the AI/ML model segmentation point and the data in the first message; the eighth message is used to provide the AF with the parameters carried when sending the fourteenth request to the UE(s) to participate in model joint reasoning, the fourteenth request is used to request the UE(s) to perform a model joint reasoning operation, and the parameters carried in the fourteenth request include the eighth message.
• When the network entity is a new network entity MMF, the processor 1410, when configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis, specifically performs the following:
• the fifteenth request is used to request AI/ML model segmentation analysis; the parameters carried in the fifteenth request include at least one of the following: the analysis type identification associated with the model identification or model segmentation identification, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model are required when inferring the AI/ML model, the parameters carried in the fifteenth request also include: the privacy level of the data set required for inference of the AI/ML model and/or the delay requirement information of different layers of the model;
  • the UE(s) to be involved in model joint reasoning has the ability to fully support model joint reasoning, use the parameters carried in the fifteenth request as the first message;
  • the UE(s) to participate in model joint reasoning does not have the ability to fully support model joint reasoning, send a sixteenth request to NWDAF, and the parameters carried in the sixteenth request include the fifteenth The parameters carried in the request, and the sixteenth request is used to request NWDAF to find other UE(s) that can participate in model reasoning through NF and the computing power that can be provided by other UE(s) that can participate in model reasoning.
  • the parameters carried in the sixteenth request also include: other UEs that can participate in model reasoning ( s) Inferring the privacy level of the data set required by the AI/ML model and/or the delay requirement information of different layers of the model;
  • processor 1410 is also used for:
  • the fifteenth request result also includes model segmentation point information corresponding to other UE(s) participating in joint model reasoning, and transparently transmits the information to The UE(s) to be involved in model joint reasoning.
  • when the network entity is a policy control function PCF, the processor 1410 being configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis specifically includes:
  • receiving a seventeenth request, where the seventeenth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the seventeenth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the identifier of a user equipment UE or a group of UEs accepting AI/ML model segmentation or any UEs that meet the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the seventeenth request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the fifth data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the fifth data also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the eighteenth request result is used as the first message.
  • the processor 1410 is also configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a ninth message to the AF, where the ninth message includes the AI/ML model segmentation point and the data of the first message;
  • the ninth message is used to provide the AF with the parameters to be carried when sending the nineteenth request to the UE(s) to participate in model joint reasoning, and the nineteenth request is used to request the UE(s) to perform the model joint reasoning operation;
  • the parameters carried in the nineteenth request include the ninth message.
  • when the network entity is a policy control function PCF, the processor 1410 being configured to receive the first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis specifically includes:
  • receiving a twentieth request, where the twentieth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the twentieth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the twentieth request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the twenty-first request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the twenty-first request result includes: the other UE(s) that can participate in model reasoning and the computing power, available memory and remaining power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the twenty-first request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the delay requirement information of different layers of the model is provided by the AF.
  • the processor 1410 is also configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send the twentieth request result to the AF, where the twentieth request result includes the AI/ML model segmentation point information;
  • if the twentieth request result also includes model segmentation point information corresponding to other UE(s) participating in model joint reasoning, that information is transparently transmitted to the UE.
  • The device for assisting model segmentation provided by the present disclosure can implement all the method steps implemented by the method embodiments shown in Fig. 2 to Fig. 12 and can achieve the same technical effect; the parts and beneficial effects of this embodiment that are the same as those of the method embodiments are not described in detail again here.
  • Fig. 15 is a schematic structural diagram of a device for assisting model segmentation provided by another embodiment of the present disclosure. As shown in Fig. 15, the device for assisting model segmentation provided in this embodiment is applied to a network entity, and the device 1500 for assisting model segmentation includes:
  • a receiving unit 1501, configured to receive a first message for assisting artificial intelligence/machine learning AI/ML model segmentation analysis;
  • a determining unit 1502, configured to determine a segmentation result of AI/ML model segmentation according to the first message.
  • the first message includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the user permanent identifier SUPI, the identifier of the application using the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model;
  • the first message also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model.
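For illustration only, the fields of the first message listed above could be grouped into a structure like the following. All field names, types, and units are hypothetical; the disclosure does not prescribe any particular encoding.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical container for the "first message" fields listed above.
# Names and units are illustrative only; the disclosure defines no encoding.
@dataclass
class ModelSplitAssistMessage:
    ue_id_or_supi: Optional[str] = None          # UE(s) to participate, or SUPI
    app_id: Optional[str] = None                 # application using the AI/ML model
    ue_compute_gflops: Optional[float] = None    # computing power the UE can provide
    ue_memory_mb: Optional[float] = None         # memory the UE can provide
    ue_battery_pct: Optional[float] = None       # remaining power of the UE
    model_size_mb: Optional[float] = None        # size of the AI/ML model
    dataset_privacy_level: Optional[int] = None  # optional: privacy level of the data set
    layer_delay_req_ms: Optional[dict] = None    # optional: per-layer delay requirements

# Every field is optional, mirroring "includes at least one of the following".
msg = ModelSplitAssistMessage(ue_id_or_supi="supi-001", ue_memory_mb=512.0,
                              ue_battery_pct=40.0, model_size_mb=120.0)
print(msg.ue_memory_mb)  # → 512.0
```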
  • the determining unit is specifically configured to: according to the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning, determine the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
  • or, according to the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, the privacy level of the data set required for reasoning the AI/ML model, and/or the delay requirement information of different layers of the model, determine the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
  • if the computing power that the target UE can provide is lower than the first preset computing power threshold, or the memory that the target UE can provide is lower than the first preset memory threshold, or the remaining power of the target UE is lower than the first preset power threshold, it is determined that the target UE performs reasoning of a first preset number of layers;
  • if the computing power that the target UE can provide is higher than the first preset computing power threshold and lower than the second preset computing power threshold, the memory that the target UE can provide is higher than the first preset memory threshold and lower than the second preset memory threshold, and the remaining power of the target UE is higher than the first preset power threshold and lower than the second preset power threshold, it is determined that the target UE performs reasoning of a second preset number of layers, where the first preset number of layers is smaller than the second preset number of layers;
  • if the computing power that the target UE can provide is higher than the second preset computing power threshold and lower than the third preset computing power threshold, the memory that the target UE can provide is higher than the second preset memory threshold and lower than the third preset memory threshold, and the remaining power of the target UE is higher than the second preset power threshold and lower than the third preset power threshold, it is determined that the target UE performs reasoning of a third preset number of layers;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required for reasoning the AI/ML model, the privacy level of the data set required for reasoning the AI/ML model is higher than the first preset privacy level and lower than the second preset privacy level, and the delay requirement information of different layers of the model is higher than the first preset delay and lower than the second preset delay, it is determined that the target UE performs reasoning of the second preset number of layers;
  • the AI/ML model segmentation point information is used to represent the AI/ML model segmentation ratio;
  • the preset number of layers includes the Nth preset number of layers, where N is greater than or equal to one.
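The tiered decision above can be sketched as follows. All threshold values and the mapping from tiers to layer counts are illustrative assumptions for this sketch; the disclosure only fixes the ordering (fewer resources, fewer layers), not the values.

```python
# Illustrative sketch of the tiered decision above: map a target UE's
# resources to a preset number of model layers it should run locally.
# All thresholds and layer counts are hypothetical example values.
COMPUTE_THRESHOLDS = (10.0, 50.0, 200.0)     # GFLOPS: first, second, third
MEMORY_THRESHOLDS = (256.0, 1024.0, 4096.0)  # MB
POWER_THRESHOLDS = (20.0, 50.0, 80.0)        # remaining battery, percent
PRESET_LAYERS = (2, 5, 8)                    # first < second < third preset layers

def layers_for_ue(compute, memory, power):
    """Return the preset number of layers the target UE performs."""
    # Any resource below its first threshold -> the smallest layer count.
    if (compute < COMPUTE_THRESHOLDS[0] or memory < MEMORY_THRESHOLDS[0]
            or power < POWER_THRESHOLDS[0]):
        return PRESET_LAYERS[0]
    # All resources between the first and second thresholds -> second tier.
    if (compute < COMPUTE_THRESHOLDS[1] and memory < MEMORY_THRESHOLDS[1]
            and power < POWER_THRESHOLDS[1]):
        return PRESET_LAYERS[1]
    # All resources between the second and third thresholds -> third tier.
    if (compute < COMPUTE_THRESHOLDS[2] and memory < MEMORY_THRESHOLDS[2]
            and power < POWER_THRESHOLDS[2]):
        return PRESET_LAYERS[2]
    return PRESET_LAYERS[2]  # at or above the third thresholds: run the most layers

print(layers_for_ue(compute=5.0, memory=512.0, power=60.0))   # low compute -> 2
print(layers_for_ue(compute=30.0, memory=512.0, power=40.0))  # second tier -> 5
```

The layer count chosen this way can then be expressed as a segmentation ratio (layers run on the UE divided by total model layers), matching the statement that the segmentation point information represents the segmentation ratio.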
  • the receiving unit is specifically configured to:
  • when the network entity is a network data analysis function NWDAF, receive a first request;
  • the parameters carried in the first request include at least one of the following: an analysis type identifier associated with model segmentation, and an identifier of a user equipment UE or a group of UEs that accept AI/ML model segmentation or any UEs that meet the segmentation conditions;
  • if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the first request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • according to the parameters carried in the first request, send a second message to the 5GC NF(s), where the second message is used to request the 5GC NF(s) to collect first data corresponding to the UE(s); the first data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the first request include the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model, the first data also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a third message to the AF, where the third message includes the AI/ML model segmentation point information and the first data; the third message is used to provide the AF with the parameters to be carried when sending the second request to the UE(s) to participate in model joint reasoning, and the second request is used to request the UE(s) to perform the model joint reasoning operation; the parameters carried in the second request include the third message.
  • the receiving unit is specifically configured to:
  • the delay requirement information of different layers of the model is provided by the AF.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • the receiving unit is specifically configured to:
  • the fifth message is used to request the 5GC NF(s) to collect second data corresponding to the UE(s); the second data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the fifth request include the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model, the second data also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the second data is provided by the UE(s) to participate in model joint reasoning after they agree to the request;
  • the delay requirement information of different layers of the model is provided by the AF.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a sixth message to the AF, where the sixth message includes at least the AI/ML model segmentation point information and the second data; the sixth message is used to provide the AF with the sixth request result sent to the UE(s) to participate in model joint reasoning, and the sixth request result includes the AI/ML model segmentation point information corresponding to the UE(s) to participate in model joint reasoning, the other UE(s) that can participate in model reasoning, and the corresponding AI/ML model segmentation point information.
  • the seventh request is used to request to perform the model joint reasoning operation with the NWDAF; the parameters carried in the seventh request include at least one of the following: the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the seventh request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model; according to the parameters carried in the seventh request, determine whether the AF and the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning;
  • request the NF to find other UE(s) that can participate in model reasoning and to provide the computing power, available memory and remaining power that the other UE(s) can provide; and, if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, request the NF to provide the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the search result includes the other UE(s) that can participate in model reasoning and the computing power, available memory and remaining power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the search result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a seventh request result to the UE(s) to participate in model joint reasoning, where the seventh request result includes the AI/ML model segmentation point information;
  • if the seventh request result also includes model segmentation point information corresponding to other UE(s) participating in model joint reasoning, that information is transparently transmitted to the UE through the AF.
  • the receiving unit is specifically configured to:
  • the eighth request is used to request the AF to perform the model joint reasoning operation; the parameters carried in the eighth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the AI/ML model segmentation area, the size of the AI/ML model, and the computing power that can be provided by the UE(s) to participate in model joint reasoning;
  • if the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, using the parameters carried in the eighth request as the first message;
  • if the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, sending a ninth request to the network data analysis function NWDAF, where the parameters carried in the ninth request include the parameters carried in the eighth request, and the ninth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning and the computing power, available memory and remaining power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the ninth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • receiving the ninth request result sent by the NWDAF, and using the ninth request result and the parameters carried in the eighth request as the first message; where the ninth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory and remaining power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the ninth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the delay requirement information of different layers of the model is provided by the AF.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send the result of determining whether the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, the relevant information of the UE(s) to participate in model joint reasoning, and the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning to the UE;
  • the relevant information includes at least one of the following: the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the relevant information also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model.
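The branch that recurs throughout this section (use the request parameters directly when the candidate UE(s) can fully support model joint reasoning, otherwise ask the NWDAF to discover additional UEs) can be sketched as follows. The capability check, its threshold values, and the NWDAF lookup stub are illustrative assumptions, not interfaces defined by the disclosure:

```python
# Illustrative sketch of the fallback flow: if the candidate UE can fully
# support model joint reasoning, its parameters become the "first message";
# otherwise additional UEs are requested from NWDAF (here a stub function).
def can_fully_support(params, min_compute=50.0, min_memory=1024.0, min_power=30.0):
    # Hypothetical capability check against the model's resource needs.
    return (params["compute"] >= min_compute and params["memory"] >= min_memory
            and params["power"] >= min_power)

def nwdaf_find_other_ues(params):
    # Stub for the NWDAF discovery step (the ninth/sixteenth request in the text):
    # returns other UE(s) plus the compute/memory/power they can provide.
    return [{"ue": "ue-2", "compute": 80.0, "memory": 2048.0, "power": 70.0}]

def build_first_message(params):
    if can_fully_support(params):
        return {"ues": [params]}               # request parameters used directly
    others = nwdaf_find_other_ues(params)      # fall back to NWDAF discovery
    return {"ues": [params] + others}

weak_ue = {"ue": "ue-1", "compute": 10.0, "memory": 512.0, "power": 80.0}
print(len(build_first_message(weak_ue)["ues"]))  # weak UE -> 2 participants
```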
  • the receiving unit is specifically configured to:
  • the parameters carried in the tenth request include at least one of the following: an analysis type identifier associated with a model identifier or a model segmentation identifier, the identifier of a user equipment UE or a group of UEs receiving AI/ML model segmentation or any UEs that meet the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the tenth request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the tenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the identifier of the application using the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the tenth request include the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model, the tenth request result also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the tenth request result is used as the first message.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • send an eleventh request to the UE(s) to participate in model joint reasoning, where the eleventh request is used to request the UE(s) to participate in model joint reasoning to perform the model joint reasoning operation; the parameters carried in the eleventh request include at least one of the following: the model segmentation point information and the first message;
  • the eleventh request result includes accepting the eleventh request or not accepting the eleventh request.
  • the receiving unit is specifically configured to:
  • when the network entity is a new network entity MMF, receive the twelfth request sent by the application function AF directly or through the network capability exposure function NEF, where the twelfth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the twelfth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the identifier of a user equipment UE or a group of UEs receiving AI/ML model segmentation or any UEs that meet the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the twelfth request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the thirteenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the identifier of the application using the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the thirteenth request include the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model, the thirteenth request result also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the thirteenth request result is used as the first message.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send an eighth message to the AF, where the eighth message includes the AI/ML model segmentation point and the data in the first message; the eighth message is used to provide the AF with the parameters to be carried when sending the fourteenth request to the UE(s) to participate in model joint reasoning, and the fourteenth request is used to request the UE(s) to perform the model joint reasoning operation; the parameters carried in the fourteenth request include the eighth message.
  • the receiving unit is specifically configured to:
  • receive a fifteenth request, where the fifteenth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the fifteenth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the fifteenth request also include: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the UE(s) to participate in model joint reasoning have the ability to fully support model joint reasoning, using the parameters carried in the fifteenth request as the first message;
  • if the UE(s) to participate in model joint reasoning do not have the ability to fully support model joint reasoning, sending a sixteenth request to the NWDAF, where the parameters carried in the sixteenth request include the parameters carried in the fifteenth request, and the sixteenth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model reasoning and the computing power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the parameters carried in the sixteenth request also include: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the sixteenth request result includes the other UE(s) that can participate in model reasoning and the computing power, available memory and remaining power that the other UE(s) can provide; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the sixteenth request result also includes: the privacy level of the data set required by the other UE(s) that can participate in model reasoning for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • if the fifteenth request result also includes model segmentation point information corresponding to other UE(s) participating in model joint reasoning, that information is transparently transmitted to the UE(s) to participate in model joint reasoning.
  • the receiving unit is specifically configured to:
  • when the network entity is a policy control function PCF, receive the seventeenth request sent by the application function AF directly or through the network capability exposure function NEF, where the seventeenth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the seventeenth request include at least one of the following: the analysis type identifier associated with the model identifier or the model segmentation identifier, the identifier of a user equipment UE or a group of UEs receiving AI/ML model segmentation or any UEs that meet the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, and the remaining power of the UE(s) to participate in model joint reasoning;
  • the fifth data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power that can be provided by the UE(s) to participate in model joint reasoning, the memory that can be provided by the UE(s) to participate in model joint reasoning, the remaining power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the privacy level of the data set and/or the time delay of different layers of the model are required when reasoning the AI/ML model, the fifth data also includes: the privacy level of the data set required for reasoning the AI/ML model and/or the delay requirement information of different layers of the model;
  • the eighteenth request result is used as the first message.
  • the device for assisting model segmentation further includes a processing unit; the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a ninth message to the AF, where the ninth message includes the AI/ML model segmentation point and the data of the first message;
  • the ninth message is used to provide the AF with the parameters to be carried when sending the nineteenth request to the UE(s) to participate in model joint reasoning, and the nineteenth request is used to request the UE(s) to perform the model joint reasoning operation; the parameters carried in the nineteenth request include the ninth message.
  • The receiving unit is specifically configured to: if the network entity is a policy control function (PCF), receive a twentieth request sent by the AF directly or through the network exposure function (NEF);
  • the twentieth request is used to request AI/ML model segmentation analysis;
  • the parameters carried in the twentieth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the twentieth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the AF and the UE(s) to participate in model joint reasoning do not have the capability to fully support model joint reasoning, send a twenty-first request to the NWDAF, where the parameters carried in the twenty-first request include the parameters carried in the twentieth request, and the twenty-first request is used to request the NWDAF to find other UE(s) that can participate in model inference and the computing power those UE(s) can provide;
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the twenty-first request further include: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • the twenty-first request result includes: the other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the twenty-first request result further includes: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the parameters carried in the twentieth request do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
  • The apparatus for assisting model segmentation further includes a processing unit, and the processing unit is configured to:
  • after determining the segmentation result of the AI/ML model segmentation, send a twentieth request result to the AF, where the twentieth request result includes AI/ML model segmentation point information;
  • if there are other UE(s) participating in joint model reasoning, the twentieth request result further includes the model segmentation point information corresponding to those UE(s), which is transparently forwarded to the UE(s) through the AF.
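To make the branch in this receiving-unit flow concrete — proceed with the analysis when the AF and the candidate UE(s) can fully support joint inference, otherwise raise a twenty-first request toward the NWDAF to find other UE(s) — the following Python sketch may help. The `Party` structure, the single `required_compute` sufficiency test, and all names are illustrative assumptions; the disclosure does not prescribe a concrete capability test.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Party:
    name: str
    compute: float  # computing power the party can provide (arbitrary units)

def handle_twentieth_request(af: Party, ues: List[Party],
                             required_compute: float) -> Dict[str, object]:
    """If the AF plus the candidate UE(s) can fully support joint inference,
    the carried parameters become the 'first message' for segmentation
    analysis; otherwise a twenty-first request is raised toward the NWDAF
    to find other UE(s) that can participate in model inference."""
    offered = af.compute + sum(ue.compute for ue in ues)
    if offered >= required_compute:
        return {"action": "analyze",
                "first_message": {"parties": [af.name] + [ue.name for ue in ues]}}
    return {"action": "send_twenty_first_request",
            "compute_shortfall": required_compute - offered}
```

In a real deployment the sufficiency test would also cover memory, battery, and the privacy/delay requirements listed in the request parameters; a single compute figure is used here only to keep the branch structure visible.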
  • The apparatus for assisting model segmentation provided by the present disclosure can implement all the method steps implemented by the method embodiments in Fig. 2 to Fig. 12 and can achieve the same technical effect. The parts and beneficial effects that are the same as those in the method embodiments will not be described in detail again here.
  • FIG. 16 is a schematic structural diagram of an apparatus for assisting model segmentation provided in yet another embodiment of the present disclosure. As shown in FIG. 16, the apparatus for assisting model segmentation provided in this embodiment is applied to a user equipment (UE) and includes: a transceiver 1600 configured to receive and send data under the control of a processor 1610.
  • The bus architecture may include any number of interconnected buses and bridges, linking together one or more processors represented by the processor 1610 and various memory circuits represented by the memory 1620.
  • The bus architecture can also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not further described herein.
  • The bus interface provides the interface.
  • The transceiver 1600 may be a plurality of elements, including a transmitter and a receiver, providing a unit for communicating with various other devices over transmission media, including wireless channels, wired channels, optical cables, and other transmission media.
  • The processor 1610 is responsible for managing the bus architecture and general processing, and the memory 1620 can store data used by the processor 1610 when performing operations.
  • The processor 1610 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a complex programmable logic device (CPLD), and the processor may also adopt a multi-core architecture.
  • The memory 1620 is used to store a computer program; the transceiver 1600 is used to send and receive data under the control of the processor; and the processor 1610 is used to read the computer program in the memory and perform the following operations:
  • determine AI/ML model segmentation point information according to the UE's own capability information, and use the AI/ML model segmentation point information as the segmentation result of the AI/ML model segmentation;
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, determine the model segmentation point information according to the available computing power, available memory, and remaining battery power in the UE's own capability information, together with the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • If the computing power the target UE can provide is lower than a first preset computing power threshold, or the memory the target UE can provide is lower than a first preset memory threshold, or the remaining battery power of the target UE is lower than a first preset battery power threshold, it is determined that the target UE performs inference of a first preset number of layers.
  • If the computing power the target UE can provide is higher than the first preset computing power threshold and lower than a second preset computing power threshold, the memory the target UE can provide is higher than the first preset memory threshold and lower than a second preset memory threshold, and the remaining battery power of the target UE is higher than the first preset battery power threshold and lower than a second preset battery power threshold, it is determined that the target UE performs inference of a second preset number of layers, where the first preset number of layers is smaller than the second preset number of layers.
  • If the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is lower than a first preset privacy level or the delay requirement information of different layers of the model is lower than a first preset delay, the target UE performs inference of the first preset number of layers.
  • If the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is higher than a second preset privacy level and lower than a third preset privacy level, and the delay requirement information of different layers of the model is higher than a second preset delay and lower than a third preset delay, it is determined that the target UE performs inference of a third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers.
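The tiered thresholds above amount to banding each resource and letting the most constrained one decide how many layers the target UE infers: the lower bound is joined with "or" (any weak resource forces the first preset number of layers) and the higher tiers with "and" (all resources must reach the band). A minimal sketch, assuming hypothetical threshold values and layer tiers that the disclosure does not specify:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UECapability:
    compute: float  # computing power the UE can provide (arbitrary units)
    memory: float   # memory the UE can provide, e.g. in MB
    battery: float  # remaining battery power, as a fraction 0.0-1.0

def _band(value: float, thresholds: List[float]) -> int:
    """Index of the band `value` falls into: 0 below the first threshold,
    len(thresholds) at or above the last."""
    band = 0
    for threshold in thresholds:
        if value < threshold:
            break
        band += 1
    return band

def layers_for_ue(cap: UECapability,
                  compute_thresholds: List[float],
                  memory_thresholds: List[float],
                  battery_thresholds: List[float],
                  layer_tiers: List[int]) -> int:
    """Map a target UE's capability onto a preset number of layers to infer.

    The most constrained resource decides the tier, which reproduces the
    or/and structure of the conditions in the text.  layer_tiers must be
    ascending and one element longer than each threshold list.
    """
    tier = min(_band(cap.compute, compute_thresholds),
               _band(cap.memory, memory_thresholds),
               _band(cap.battery, battery_thresholds))
    return layer_tiers[tier]
```

For instance, with three thresholds per resource and layer tiers `[2, 4, 8, 16]`, a UE below any first threshold is assigned 2 layers, while a UE above all third thresholds is assigned 16.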
  • The parameters include at least one of the following: the AI/ML model segmentation point, the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the first request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • If the parameters do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
  • The processor 1610 is also configured to:
  • receive a second request result sent by the NWDAF, where the second request result is determined by the NWDAF according to the parameters carried in the first request, and the second request result includes accepting the first request or not accepting the first request;
  • if the parameters do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
  • The apparatus for assisting model segmentation provided by the present disclosure can implement all the method steps implemented by the method embodiment shown in FIG. 13 and can achieve the same technical effect. The parts and beneficial effects that are the same as those in the method embodiment will not be described in detail again here.
  • FIG. 17 is a schematic structural diagram of an apparatus for assisting model segmentation provided by another embodiment of the present disclosure. As shown in FIG. 17, the apparatus 1700 for assisting model segmentation includes:
  • a determining unit 1701 configured to determine AI/ML model segmentation point information according to the UE's own capability information;
  • the determining unit 1701 is specifically configured to:
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, determine the model segmentation point information according to the available computing power, available memory, and remaining battery power in the UE's own capability information, together with the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
  • The determining unit 1701 is specifically configured to:
  • if the computing power the target UE can provide is higher than the first preset computing power threshold and lower than a second preset computing power threshold, the memory the target UE can provide is higher than the first preset memory threshold and lower than a second preset memory threshold, and the remaining battery power of the target UE is higher than the first preset battery power threshold and lower than a second preset battery power threshold, determine that the target UE performs inference of a second preset number of layers, where the first preset number of layers is smaller than the second preset number of layers;
  • if the computing power the target UE can provide is higher than the second preset computing power threshold and lower than a third preset computing power threshold, the memory the target UE can provide is higher than the second preset memory threshold and lower than a third preset memory threshold, and the remaining battery power of the target UE is higher than the second preset battery power threshold and lower than a third preset battery power threshold, determine that the target UE performs inference of a third preset number of layers;
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is lower than a first preset privacy level or the delay requirement information of different layers of the model is lower than a first preset delay, the target UE performs inference of the first preset number of layers;
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is higher than the first preset privacy level and lower than a second preset privacy level, and the delay requirement information of different layers of the model is higher than the first preset delay and lower than a second preset delay, determine that the target UE performs inference of the second preset number of layers;
  • if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is higher than the second preset privacy level and lower than a third preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than a third preset delay, determine that the target UE performs inference of a third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)th preset number of layers, where the Nth preset number of layers is smaller than the (N+1)th preset number of layers;
  • determine the AI/ML model segmentation point information according to the preset number of layers, where the AI/ML model segmentation point information is used to represent the AI/ML model segmentation ratio, and the preset number of layers includes the Nth preset number of layers, N being greater than or equal to one.
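Since the segmentation point information represents a segmentation ratio, one simple reading is the fraction of model layers assigned to the UE, with the rest inferred on the network side. A minimal sketch with assumed names; the disclosure does not define how the ratio is encoded:

```python
def split_ratio(ue_layers: int, total_layers: int) -> float:
    """Express the segmentation point as the fraction of layers the UE infers;
    the remaining (total_layers - ue_layers) layers fall to the network side."""
    if total_layers <= 0:
        raise ValueError("the model must have at least one layer")
    # Cap a preset tier that exceeds the model depth.
    ue_layers = max(0, min(ue_layers, total_layers))
    return ue_layers / total_layers
```

For example, a UE assigned a 4-layer tier of a 16-layer model would correspond to a segmentation ratio of 0.25.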
  • The processing unit is also configured to:
  • the parameters include at least one of the following: the AI/ML model segmentation point, the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the first request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • if the parameters do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
  • The processing unit is also configured to:
  • send, through the AF, a second request to the network data analytics function (NWDAF), where the second request is used to request performing a model joint reasoning operation with the NWDAF; the parameters carried in the second request include at least one of the following: the AI/ML model segmentation point, the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the second request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
  • receive a second request result sent by the NWDAF, where the second request result is determined by the NWDAF according to the parameters carried in the first request, and the second request result includes accepting the first request or not accepting the first request;
  • if the parameters do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
  • The apparatus for assisting model segmentation provided by the present disclosure can implement all the method steps implemented by the method embodiment in FIG. 13 and can achieve the same technical effect. The parts and beneficial effects that are the same as those in the method embodiment will not be described in detail again here.
  • Each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • the embodiment of the present disclosure also provides a processor-readable storage medium.
  • the processor-readable storage medium stores a computer program, and the computer program is used to cause a processor to execute any one of the above method embodiments.
  • the processor-readable storage medium can be any available medium or data storage device that the processor can access, including but not limited to magnetic storage (such as floppy disk, hard disk, magnetic tape, magneto-optical disk (MO), etc.), optical storage (such as CD, DVD, BD, HVD, etc.), and semiconductor memory (such as ROM, EPROM, EEPROM, non-volatile memory (NAND FLASH), solid-state drive (SSD)), etc.
  • the embodiments of the present disclosure may be provided as methods, systems, or computer program products. Accordingly, the present disclosure can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, optical storage, etc.) having computer-usable program code embodied therein.
  • Processor-executable instructions may also be stored in a processor-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the processor-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
  • Processor-executable instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.


Abstract

The present disclosure provides a method and apparatus for assisting model segmentation, and a readable storage medium. The method includes: a network entity receives a first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis, where the first message includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the subscription permanent identifier (SUPI), the identifier of the application that will use the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; the network entity performs analysis according to the first message and determines the segmentation result of the AI/ML model segmentation. The present disclosure enables model segmentation to be analyzed on the basis of terminal capabilities, thereby effectively protecting terminal privacy and optimizing network resources.

Description

Method and apparatus for assisting model segmentation, and readable storage medium
The present disclosure claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on December 23, 2021, with application number 202111608193.5 and entitled "Method and apparatus for assisting model segmentation, and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communication technologies, and in particular to a method and apparatus for assisting model segmentation, and a readable storage medium.
Background
In recent years, breakthroughs in artificial intelligence technology have made AI applications increasingly widespread. Because mobile terminals have strict constraints on energy consumption, computation, and memory cost, heavyweight artificial intelligence (AI)/machine learning (ML) models (hereinafter referred to as AI/ML models) cannot run on the terminal. Therefore, the approach currently adopted is to move the inference of many AI/ML models from the mobile terminal to the cloud or to other terminals, which requires transferring the AI/ML model to the cloud or the other terminals.
In addition, among the SA1 Release 18 requirements approved at SA#93e, the scenarios requiring AI/ML model transfer include AI/ML model splitting between AI/ML endpoints, i.e. an AI/ML model can be split into multiple parts based on the current task or environment. The trend is for the computation-intensive, energy-intensive parts to be inferred by the network, while the parts requiring privacy protection or sensitive to latency are inferred on the terminal.
However, in the prior art, model segmentation cannot be analyzed on the basis of terminal capabilities, so terminal privacy protection and network resource optimization cannot be achieved effectively.
Summary
The present disclosure provides a method and apparatus for assisting model segmentation, and a readable storage medium, which solve the technical problem in the prior art that model segmentation cannot be analyzed on the basis of terminal capabilities, and consequently terminal privacy protection and network resource optimization cannot be achieved effectively.
In a first aspect, the present disclosure provides a method for assisting model segmentation, applied to a network entity, the method including:
receiving a first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis;
determining a segmentation result of AI/ML model segmentation according to the first message.
Optionally, the first message includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the subscription permanent identifier (SUPI), the identifier of the application using the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model;
if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the first message further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
Optionally, determining the segmentation result of AI/ML model segmentation according to the first message includes:
determining, according to the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning in the first message, the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning according to the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model in the first message;
using the model segmentation point information as the segmentation result.
Optionally, determining the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning includes:
taking any UE among all the UE(s) determined to participate in performing model joint reasoning as a target UE, and performing the following steps for each target UE:
if the computing power the target UE can provide is lower than a first preset computing power threshold, or the memory the target UE can provide is lower than a first preset memory threshold, or the remaining battery power of the target UE is lower than a first preset battery power threshold, determining that the target UE performs inference of a first preset number of layers;
if the computing power the target UE can provide is higher than the first preset computing power threshold and lower than a second preset computing power threshold, the memory the target UE can provide is higher than the first preset memory threshold and lower than a second preset memory threshold, and the remaining battery power of the target UE is higher than the first preset battery power threshold and lower than a second preset battery power threshold, determining that the target UE performs inference of a second preset number of layers, the first preset number of layers being smaller than the second preset number of layers;
if the computing power the target UE can provide is higher than the second preset computing power threshold and lower than a third preset computing power threshold, the memory the target UE can provide is higher than the second preset memory threshold and lower than a third preset memory threshold, and the remaining battery power of the target UE is higher than the second preset battery power threshold and lower than a third preset battery power threshold, determining that the target UE performs inference of a third preset number of layers, the second preset number of layers being smaller than the third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)th preset number of layers, the Nth preset number of layers being smaller than the (N+1)th preset number of layers;
if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is lower than a first preset privacy level or the delay requirement information of different layers of the model is lower than a first preset delay, the target UE performs inference of the first preset number of layers;
if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is higher than the first preset privacy level and lower than a second preset privacy level, and the delay requirement information of different layers of the model is higher than the first preset delay and lower than a second preset delay, determining that the target UE performs inference of the second preset number of layers;
if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, and the privacy level of the data set required for inferring the AI/ML model is higher than the second preset privacy level and lower than a third preset privacy level, and the delay requirement information of different layers of the model is higher than the second preset delay and lower than a third preset delay, determining that the target UE performs inference of the third preset number of layers; and so on, until it is determined that the target UE performs inference of an (N+1)th preset number of layers, the Nth preset number of layers being smaller than the (N+1)th preset number of layers;
determining the AI/ML model segmentation point information according to the preset number of layers, where the AI/ML model segmentation point information is used to represent the AI/ML model segmentation ratio, and the preset number of layers includes the Nth preset number of layers, N being greater than or equal to one.
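The privacy/delay conditions above follow the same banding pattern as the capability thresholds. The sketch below shows one plausible reading, in which the lower of the privacy band and the delay band selects the preset number of layers; all threshold and tier values are invented for illustration and are not specified by the disclosure:

```python
from typing import List

def _band(value: float, thresholds: List[float]) -> int:
    """0 below the first threshold, k between the k-th and (k+1)-th
    thresholds, len(thresholds) at or above the last."""
    band = 0
    for threshold in thresholds:
        if value < threshold:
            break
        band += 1
    return band

def layers_from_privacy_delay(privacy_level: float, layer_delay: float,
                              privacy_thresholds: List[float],
                              delay_thresholds: List[float],
                              layer_tiers: List[int]) -> int:
    """Pick the preset number of layers from the privacy/delay conditions.

    The text joins the two conditions with 'and' for every tier above the
    first, so this sketch takes the lower of the two bands; layer_tiers is
    ascending and one element longer than each threshold list.
    """
    tier = min(_band(privacy_level, privacy_thresholds),
               _band(layer_delay, delay_thresholds))
    return layer_tiers[tier]
```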
In the embodiments of the present disclosure, the network entity receives the first message used to assist AI/ML model segmentation analysis and determines the model segmentation point information for segmenting the AI/ML model according to the computing power available to the UE to participate in model joint reasoning, the memory available to the UE to participate in model joint reasoning, and the remaining battery power of the UE to participate in model joint reasoning carried in the first message. If the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model (where, if the UE cannot provide the delay requirement information of different layers of the model, that information is provided by the AF; if the UE cannot provide the required privacy level of the data set, whether the privacy level of the data set can be set is confirmed based on SA3; for example, if SA3 can provide privacy level settings, the privacy level of the data set can be provided by SA3), the model segmentation point information can also be determined in combination with the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model. This information serves as the segmentation result, based on which the network entity or the UE(s) to participate in model joint reasoning perform the joint model inference operation. Model segmentation is therefore analyzed on the basis of terminal (i.e. UE) capabilities, which effectively protects terminal privacy and optimizes network resources.
Optionally, if the network entity is a network data analytics function (NWDAF), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a first request sent by an application function (AF), where the first request is used to request analysis of AI/ML model segmentation; the parameters carried in the first request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs accepting AI/ML model segmentation or any UEs satisfying the segmentation conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required for inferring the AI/ML model, the parameters carried in the first request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
sending, according to the parameters carried in the first request, a second message to 5GC NF(s), where the second message is used to request the 5GC NF(s) to collect first data corresponding to the UE(s), the first data including at least one of the following: the UE(s) to participate in AI/ML model segmentation and the SUPI, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the first request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the first data further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving the first data sent by the 5GC NF(s) and using the first data as the first message, where the first data is provided after the UE(s) to participate in model joint reasoning agree to the request of the second message.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending a third message to the AF, where the third message includes the AI/ML model segmentation point information and the first data; the third message is used to provide the AF with the parameters carried when sending a second request to the UE(s) to participate in model joint reasoning, the second request being used to request performing a model joint reasoning operation with the UE(s) to participate in model joint reasoning, and the parameters carried in the second request include the third message.
In this embodiment of the present disclosure, when UE(s) and AF(s) jointly infer model(s), the AF requests the NWDAF to collect the UE's capabilities to determine the model segmentation point; after collecting the data and making the determination, the NWDAF feeds back the results to the AF, and the AF sends them to the relevant UE.
Optionally, if the network entity is a network data analytics function (NWDAF), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a third request sent by the AF, where the third request is determined after the AF determines that it and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning, and whether the capability to fully support performing model joint reasoning is available is determined by the AF according to the parameters carried in a received fourth request sent by the UE(s) to participate in model joint reasoning; the fourth request is used by the UE(s) to participate in model joint reasoning to request model joint reasoning with the AF, and the parameters carried in the fourth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the fourth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model; the parameters carried in the third request include the parameters carried in the fourth request;
using the parameters carried in the third request as the first message;
where, if the parameters carried in the fourth request do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending a fourth message to the AF, where the fourth message includes the AI/ML model segmentation point information and the data in the first message; the fourth message is used to provide the AF with a fourth request result to be sent to the UE(s) to participate in model joint reasoning, and the fourth request result includes the AI/ML model segmentation point information.
In this embodiment of the present disclosure, when UE(s) and AF(s) jointly infer model(s), the UE reports its own capabilities to the AF, the AF requests the NWDAF to determine the model segmentation point, the NWDAF feeds back the determination result to the AF, and the AF sends it to the UE.
Optionally, if the network entity is a network data analytics function (NWDAF), receiving the first message used to request artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a fifth request sent by the application function (AF), where the fifth request is determined after the AF determines that it and the UE(s) to participate in model joint reasoning do not have the capability to fully support performing model joint reasoning, and whether the capability to fully support performing model joint reasoning is available is determined by the AF according to a received sixth request sent by the UE(s) to participate in model joint reasoning; the fifth request is used to request analysis of AI/ML model segmentation and to request finding other UE(s) that can participate in model inference together with the computing power, available memory, and remaining battery power those UE(s) can provide; the parameters carried in the fifth request include at least one of the following: the analysis type identifier associated with model segmentation, the identifier of one user equipment (UE) or a group of UEs accepting AI/ML model segmentation or any UEs satisfying the segmentation conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the fifth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model; the sixth request is used by the UE(s) to participate in model joint reasoning to request model joint reasoning with the AF, and the parameters carried in the sixth request include the parameters carried in the fifth request;
sending, according to the parameters carried in the fifth request, a fifth message to 5GC NF(s), where the fifth message is used to request the 5GC NF(s) to collect second data corresponding to the UE(s), the second data including at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the fifth request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the second data further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving the second data sent by the 5GC NF(s) and using the second data as the first message, where the second data is provided after the UE(s) to participate in model joint reasoning agree to the request of the fifth message;
where, if the parameters carried in the sixth request do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending a sixth message to the AF, where the sixth message includes at least the AI/ML model segmentation point information and the second data; the sixth message is used to provide the AF with a sixth request result to be sent to the UE(s) to participate in model joint reasoning, and the sixth request result includes the UE(s) to participate in model joint reasoning and the corresponding AI/ML model segmentation point information.
In this embodiment of the present disclosure, when UE(s) and AF(s) jointly infer a model, the UE reports its own capabilities to the AF, the AF requests the NWDAF to determine the model segmentation point and requests that other UE(s) capable of participating in model inference join the inference of this model, the NWDAF feeds back the collection and determination results to the AF, and the AF sends them to the UE.
Optionally, if the network entity is a network data analytics function (NWDAF), receiving the first message used to request artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving a seventh request sent by the UE(s) to participate in model joint reasoning, where the seventh request is used to request performing a model joint reasoning operation with the NWDAF; the parameters carried in the seventh request include at least one of the following: the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the seventh request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model; determining, according to the parameters carried in the seventh request, whether the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning;
if the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning, using the parameters carried in the seventh request as the first message;
if the UE(s) to participate in model joint reasoning do not have the capability to fully support performing model joint reasoning, requesting the NF to find other UE(s) that can participate in model inference and to provide the computing power, available memory, and remaining battery power those UE(s) can provide; and if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, requesting the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving the search result sent by the NF, where the search result includes the other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the search result further includes: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
using the parameters carried in the seventh request and the search result as the first message;
where, if the parameters carried in the seventh request do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending a seventh request result to the UE(s) to participate in model joint reasoning, where the seventh request result includes AI/ML model segmentation point information;
where, if there are other UE(s) participating in joint model reasoning, the seventh request result further includes the model segmentation point information corresponding to those UE(s), which is transparently forwarded to the UE(s) through the AF.
In this embodiment of the present disclosure, when the UE and the NWDAF jointly infer a model, the UE reports its own capabilities to the NWDAF through the AF and at the same time requests the NWDAF to determine the model segmentation point, and the NWDAF feeds back the determination result to the UE through the AF.
Optionally, if the network entity is an application function (AF), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving an eighth request sent by the UE(s) to participate in model joint reasoning, where the eighth request is used to request performing a model joint reasoning operation with the AF; the parameters carried in the eighth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the eighth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model; determining, according to the parameters carried in the eighth request, whether the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning;
if the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning, using the parameters carried in the eighth request as the first message;
if the AF and the UE(s) to participate in model joint reasoning do not have the capability to fully support performing model joint reasoning, sending a ninth request to the network data analytics function (NWDAF), where the parameters carried in the ninth request include the parameters carried in the eighth request, and the ninth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the ninth request further include: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving a ninth request result sent by the NWDAF, and using the ninth request result and the parameters carried in the eighth request as the first message; the ninth request result includes the other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the ninth request result further includes: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
if the parameters carried in the eighth request do not contain the delay requirement information of different layers of the model, the delay requirement information of different layers of the model is provided by the AF.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending to the UE the result of determining whether the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning, the related information of the UE(s) to participate in model joint reasoning, and the model segmentation point information corresponding to the UE(s) to participate in model joint reasoning;
where the related information includes at least one of the following: the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the related information further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
In this embodiment of the present disclosure, when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF, and the AF performs the joint reasoning and model segmentation point determination and then feeds back the determination result to the UE.
Optionally, if the network entity is an application function (AF), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
sending a tenth request to the network data analytics function (NWDAF), where the tenth request is used to request the NWDAF to collect, from 5GC NF(s), data corresponding to the UE(s) for analyzing AI/ML model segmentation; the parameters carried in the tenth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the tenth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving a tenth request result sent by the NWDAF, where the tenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the identifier of the application using the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the tenth request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the tenth request result further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
using the tenth request result as the first message.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending an eleventh request to the UE(s) to participate in model joint reasoning, where the eleventh request is used to request performing a model joint reasoning operation with the UE(s) to participate in model joint reasoning; the parameters carried in the eleventh request include at least one of the following: the model segmentation point information and the first message;
receiving an eleventh request result sent by the UE(s) to participate in model joint reasoning, where the eleventh request result is determined by the UE(s) to participate in model joint reasoning according to the parameters carried in the eleventh request, and the eleventh request result includes accepting the eleventh request or not accepting the eleventh request.
In this embodiment of the present disclosure, when the UE and the AF jointly infer a model, the AF requests the NWDAF to collect the UE's capabilities, and based on the received analysis result, the AF determines the model segmentation point and initiates a joint reasoning request toward the relevant UE.
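In this flow the UE decides the eleventh request result from the carried parameters. The acceptance rule below (the UE's share of the model must fit its free memory, with a minimum battery level) is purely an assumed example; the disclosure only states that the UE answers with accepting or not accepting the request:

```python
from typing import Dict

def eleventh_request_result(params: Dict[str, float],
                            ue_free_memory: float,
                            ue_battery: float) -> str:
    """The UE checks the carried parameters (model segmentation point, model
    size, ...) against its local resources and answers the AF with 'accept'
    or 'reject'."""
    # Memory the UE would need for its share of the layers, assuming the
    # segmentation point is expressed as a fraction of the model size.
    ue_share = params["model_size"] * params["split_ratio"]
    if ue_share <= ue_free_memory and ue_battery >= params.get("min_battery", 0.2):
        return "accept"
    return "reject"
```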
Optionally, if the network entity is a new network entity (for example, an MMF, Model Management Function), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a twelfth request sent by the application function (AF), where the twelfth request is used to request AI/ML model segmentation analysis; the parameters carried in the twelfth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the twelfth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
sending, according to the parameters carried in the twelfth request, a thirteenth request to the NWDAF, where the parameters carried in the thirteenth request include the parameters carried in the twelfth request, and the thirteenth request is used to request the NWDAF to collect, from 5GC NF(s), data of the UE(s) for analyzing AI/ML model segmentation;
receiving a thirteenth request result sent by the NWDAF, where the thirteenth request result includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the identifier of the application using the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the parameters carried in the thirteenth request include the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model, the thirteenth request result further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
using the thirteenth request result as the first message.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending an eighth message to the AF, where the eighth message includes the AI/ML model segmentation point and the data in the first message; the eighth message is used to provide the AF with the parameters carried when sending a fourteenth request to the UE(s) to participate in model joint reasoning, the fourteenth request being used to request performing a model joint reasoning operation with the UE(s) to participate in model joint reasoning, and the parameters carried in the fourteenth request include the eighth message.
In this embodiment of the present disclosure, when the UE and the AF jointly infer a model, the AF requests the new network entity MMF to perform model segmentation, the MMF requests the NWDAF to collect the UE's capabilities, the MMF determines the model segmentation point based on the received analysis result and feeds back the result to the AF, and the AF initiates a joint reasoning request toward the relevant UE.
Optionally, if the network entity is the new network entity MMF, receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a fifteenth request sent by the application function (AF), where the fifteenth request is used to request AI/ML model segmentation analysis, and the parameters carried in the fifteenth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the fifteenth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
determining, according to the parameters carried in the fifteenth request, whether the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning;
if the AF and the UE(s) to participate in model joint reasoning have the capability to fully support performing model joint reasoning, using the parameters carried in the fifteenth request as the first message;
if the AF and the UE(s) to participate in model joint reasoning do not have the capability to fully support performing model joint reasoning, sending a sixteenth request to the NWDAF, where the parameters carried in the sixteenth request include the parameters carried in the fifteenth request, and the sixteenth request is used to request the NWDAF to find, through the NF, other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the sixteenth request further include: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving a sixteenth request result sent by the NWDAF, and using the sixteenth request result and the parameters carried in the fifteenth request as the first message; the sixteenth request result includes the other UE(s) that can participate in model inference and the computing power, available memory, and remaining battery power those UE(s) can provide; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the sixteenth request result further includes: the privacy level of the data set required by the other UE(s) that can participate in model inference for inferring the AI/ML model and/or the delay requirement information of different layers of the model.
Optionally, after determining the segmentation result of AI/ML model segmentation, the method further includes:
sending a fifteenth request result to the AF, where the fifteenth request result includes AI/ML model segmentation point information;
where, if there are other UE(s) participating in joint model reasoning, the fifteenth request result further includes the model segmentation point information corresponding to those UE(s), which is transparently forwarded through the AF to the UE(s) to participate in model joint reasoning.
In this embodiment of the present disclosure, when the UE and the AF jointly infer a model, the UE reports its own capabilities to the AF to request model joint reasoning; the AF passes its own and the UE's capabilities to the MMF and requests the MMF to perform model segmentation; if the MMF determines that other UE(s) need to participate in model joint reasoning, the MMF requests the NWDAF to collect information about the other UE(s); the MMF then performs model segmentation based on the collected information and feeds back the result to the AF.
Optionally, if the network entity is a policy control function (PCF), receiving the first message used to assist artificial intelligence/machine learning (AI/ML) model segmentation analysis includes:
receiving, directly or through a network exposure function (NEF), a seventeenth request sent by the application function (AF), where the seventeenth request is used to request AI/ML model segmentation analysis; the parameters carried in the seventeenth request include at least one of the following: the analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving AI/ML model segmentation or any UEs satisfying the analysis conditions, the AI/ML model segmentation area, the size of the AI/ML model, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, and the remaining battery power of the UE(s) to participate in model joint reasoning; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the parameters carried in the seventeenth request further include: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
sending, according to the parameters carried in the seventeenth request, an eighteenth request to the network data analytics function (NWDAF), where the parameters carried in the eighteenth request include the parameters carried in the seventeenth request, and the eighteenth request is used to request the NWDAF to collect, from 5GC NF(s), fifth data corresponding to the UE(s); the fifth data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or the SUPI, the computing power available to the UE(s) to participate in model joint reasoning, the memory available to the UE(s) to participate in model joint reasoning, the remaining battery power of the UE(s) to participate in model joint reasoning, and the size of the AI/ML model; if the privacy level of the data set and/or the delay of different layers of the model is required when inferring the AI/ML model, the fifth data further includes: the privacy level of the data set required for inferring the AI/ML model and/or the delay requirement information of different layers of the model;
receiving an eighteenth request result sent by the NWDAF, where the eighteenth request result includes the fifth data;
using the eighteenth request result as the first message.
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第九消息,所述第九消息中包括所述AI/ML模型切分点、所述第一消息的数据;所述第九消息用于为AF提供向待参与模型联合推理的UE(s)发送第十九请求时携带的参数,所述第十九请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十九请求中携带的参数包括所述第九消息。
本公开实施例中,当UE和AF联合推理模型时,AF请求PCF进行模型切分的策略判断,PCF请求NWDAF收集UE的能力,PCF基于收到的分析结果,进行模型切分点的判断,并向AF反馈结果,AF向相关UE发起联合推理请求。
可选地,如果所述网络实体是策略控制功能PCF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第二十请求,所述第二十请求用于请求AI/ML模型切分分析,所述第二十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延 需求信息;
根据所述第二十请求中携带的参数,确定所述AF以及待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若所述AF以及待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第二十请求中携带的参数作为所述第一消息;
若所述AF以及待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第二十一请求,所述第二十一请求中携带的参数包括所述第二十请求中携带的参数,且所述第二十一请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第二十一请求结果，并将所述第二十一请求结果以及所述第二十请求中携带的参数作为所述第一消息；其中，所述第二十一请求结果包括：其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量，若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第二十一请求结果还包括：其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
若第二十请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第二十请求结果,所述第二十请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第二十请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE(s)。
本公开实施例中,当UE和AF联合推理模型时,UE上报自身能力给AF请求模型联合推理,AF将自身以及UE能力发给PCF,请求PCF进行模型切 分,PCF判断如果需要其他UE(s)参与模型联合推理,则PCF请求NWDAF收集其他UE(s)的信息,PCF基于收集的信息进行模型切分并将结果反馈给AF。
第二方面,本公开提供一种辅助模型切分的方法,所述方法应用于用户设备UE,所述方法包括:
根据自身能力信息,确定AI/ML模型切分点信息;
将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
可选地,所述根据自身能力信息,确定AI/ML模型切分点信息,包括:
根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述自身能力信息中的可提供的算力、可提供的内存、剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定模型切分点信息。
可选地,所述确定模型切分点信息,包括:
将所有确定参与执行模型联合推理的UE(s)中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1 预设数目层的推理,第N预设数目层小于第N+1预设数目层;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
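上述基于阈值区间确定目标UE推理层数的判断逻辑，可以用如下Python草图示意。其中函数名、阈值的表示方式均为示例性假设，并非本公开限定的实现；此处以"三项能力取最小区间"来近似"算力、内存、电量三项条件同时满足同一更高区间"的判断：

```python
def select_layer_tier(compute, memory, battery,
                      compute_thresholds, memory_thresholds, battery_thresholds):
    """根据UE可提供的算力、可提供的内存与剩余的电量所落入的阈值区间，
    选出该UE执行推理的预设数目层的区间序号（序号越高，可执行的层数越多）。
    各阈值列表按升序排列，均为示例性假设的取值。"""
    def tier(value, thresholds):
        # 统计value越过了多少个预设阈值：低于第一预设阈值时为区间0，依此类推
        count = 0
        for th in thresholds:
            if value >= th:
                count += 1
        return count

    # 三项能力需同时落入同一更高区间才执行更多层的推理，故取三者区间的最小值
    return min(tier(compute, compute_thresholds),
               tier(memory, memory_thresholds),
               tier(battery, battery_thresholds))
```

例如对阈值列表[10, 20]，能力值(5, 5, 5)落在区间0（执行第一预设数目层），(25, 25, 25)落在区间2（执行第三预设数目层）。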
本公开实施例中,根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合自身能力信息中的推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络实体或待参与模型联合推理的UE基于切分结果对模型进行联合推理操作,因此,实现了基于终端(即UE)能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向应用功能AF发送第一请求,所述第一请求用于请求与所述AF执行模型联合推理操作;其中,所述第一请求中携带的参数包括下述至少一项AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML 模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收AF发送的第一请求结果,所述第一请求结果是由AF根据第一请求中携带的参数确定的,所述第一请求结果包括接受第一请求或不接受第一请求;
其中,若第一请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
本公开实施例中,当UE和AF联合推理模型时,UE基于自身能力进行模型切分点的判断,上报模型切分点(模型切分比例)给AF,并进行联合推理交互。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
通过AF向网络数据分析功能NWDAF发送第二请求,所述第二请求用于请求与所述NWDAF执行模型联合推理操作;其中,所述第二请求中携带的参数包括下述至少一项AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第二请求结果,所述第二请求结果是由NWDAF根据第一请求中携带的参数确定的,所述第二请求结果包括接受第一请求或不接受第一请求;
其中,若第二请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
本公开实施例中,当UE和NWDAF联合推理模型时,UE基于自身能力进行模型切分点的判断,上报NWDAF,并进行联合推理交互。
第三方面,本公开提供一种辅助模型切分的装置,所述装置应用于网络实体,所述装置包括存储器,收发机,处理器:
存储器,用于存储计算机程序;收发机,用于在所述处理器的控制下收发数据;处理器,用于读取所述存储器中的计算机程序并执行以下操作:
接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
根据所述第一消息,确定AI/ML模型切分的切分结果。
第四方面,本公开提供一种辅助模型切分的装置,所述装置应用于用户设备UE,所述装置包括存储器,收发机,处理器:
存储器,用于存储计算机程序;收发机,用于在所述处理器的控制下收发数据;处理器,用于读取所述存储器中的计算机程序并执行以下操作:
根据自身能力信息,确定AI/ML模型切分点信息;
将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
第五方面,本公开提供一种辅助模型切分的装置,所述装置应用于网络实体,所述装置包括:
接收单元,用于接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
确定单元,用于根据所述第一消息,确定AI/ML模型切分的切分结果。
第六方面,本公开提供一种辅助模型切分的装置,所述装置应用于用户设备UE,所述装置包括:
确定单元,用于根据自身能力信息,确定AI/ML模型切分点信息;
处理单元,用于将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
第七方面,本公开提供一种处理器可读存储介质,所述处理器可读存储介质存储有计算机程序,所述计算机程序用于使所述处理器执行第一方面或第二方面任一项所述的方法。
本公开提供一种辅助模型切分的方法、装置及可读存储介质,通过接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,并根据第一消息中的待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE 可提供的内存、待参与模型联合推理的UE剩余的电量来确定切分AI/ML模型的模型切分点信息,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络实体或待参与模型联合推理的UE基于切分结果对模型进行联合推理操作,因此,实现了基于终端(即UE)能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
应当理解,上述发明内容部分中所描述的内容并非旨在限定本公开的实施例的关键或重要特征,亦非用于限制本公开的范围。本公开的其它特征将通过以下的描述变得容易理解。
附图说明
为了更清楚地说明本公开或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本公开实施例提供的支持网络数据分析的5GC的网络架构图;
图2为本公开实施例提供的辅助模型切分的方法的第一流程示意图;
图3为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第一信令流程示意图;
图4为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第二信令流程示意图;
图5为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第三信令流程示意图;
图6为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第四信令流程示意图;
图7为本公开实施例二提供的当网络实体是AF时辅助模型切分的方法的第一信令流程示意图;
图8为本公开实施例二提供的当网络实体是AF时辅助模型切分的方法的第二信令流程示意图;
图9为本公开实施例三提供的当网络实体是MMF时辅助模型切分的方法的第一信令流程示意图;
图10为本公开实施例三提供的当网络实体是MMF时辅助模型切分的方法的第二信令流程示意图;
图11为本公开实施例四提供的当网络实体是PCF时辅助模型切分的方法的第一信令流程示意图;
图12为本公开实施例四提供的当网络实体是PCF时辅助模型切分的方法的第二信令流程示意图;
图13为本公开实施例提供的辅助模型切分的方法的第二流程示意图;
图14为本公开实施例提供的辅助模型切分的装置的结构示意图;
图15为本公开另一实施例提供的辅助模型切分的装置的结构示意图;
图16为本公开再一实施例提供的辅助模型切分的装置的结构示意图;
图17为本公开又一实施例提供的辅助模型切分的装置的结构示意图。
具体实施方式
本公开中术语“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,并不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本公开保护的范围。
为了清楚理解本公开的技术方案,首先对现有技术的方案进行详细介绍。现有技术中,在SA#93e通过的SA1R18需求中,需要AI/ML模型传输的场景如下:AI/ML端点之间的AI/ML模型切分。
一个AI/ML模型可以基于当前的任务或者环境切分成多个部分。趋势是将计算复杂、能耗大的部分由网络进行推理，需要隐私保护或者时延敏感的部分在终端推理。例如，终端下载/机载一个模型，先推理特定几层/部分，然后把中间结果发送给网络；网络再执行剩余层/部分，然后把推理结果反馈给终端。
该场景中,需要支持和辅助AI/ML模型用户(例如,application client running on the UE,即应用程序客户端运行在UE上)对模型的应用层本地训练,并支持将应用层推理反馈给AI/ML模型提供者(例如,应用功能(英文为:Application Function,简称为:AF))。
但是,现有技术中,无法实现基于终端能力对模型切分的分析,进而无法有效地实现对终端隐私的保护以及网络资源的优化。
发明人进一步研究发现,要基于终端能力对模型切分进行有效地分析,进而确定模型切分点,需要网络实体或者终端基于终端的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点,然后网络实体或者终端将模型切分点信息发送给参与模型联合推理的网络实体或者终端实现基于终端能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
所以基于上述发明人的创造性研究,提出了本公开提出的辅助模型切分的方法,本公开中,网络实体通过接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,并根据第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量来确定切分AI/ML模型的模型切分点信息,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络实体或待参与模型联合推理的UE(s)基于切分结果对模型进行联合推理操作,因此,实现了基于终端(即UE)能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。结合图1所示,图1为本公开实施例提供的支持网络数据分析的5GC的网络架构图,如图1所示,本公开实施例中,网络数据分析功能(英文为:Network data analytic function,简称为:NWDAF)是运营商管理的网络分析功能,NWDAF能够向5G核心网(英文为:5G Core Network,简称为:5GC)中各个网络功能(英文为:Network Function,简称为:NF)(即NF(s))、应用功能(英文为:Application Function,简称为: AF)和操作管理维护(英文为:Operation Administration and Maintenance,简称为:OAM)提供数据分析服务。其中,分析结果可以是历史统计信息或者预测信息。NWDAF可以服务一个或多个网络切片。
其中,在5GC中还包括其他多种功能。分别为用户平面功能(英文为:User Plane Function,简称为:UPF)、会话管理功能(英文为:Session Management Function,简称为:SMF)、接入和移动性管理功能(英文为:Access and Mobility Management Function,简称为:AMF)、统一数据库(英文为:Unified Data Repository,简称为:UDR)、网络能力开放功能(英文为:Network Exposure Function,简称为:NEF)、AF、策略控制功能(英文为:Policy Control Function,简称为:PCF)及在线计费系统(英文为:Online Charging System,简称为:OCS)。其中,这些其他功能均可统称为NF。NWDAF基于服务化接口与5G核心网中其他功能实体5GC NF(s)及OAM进行通信。
5GC中可以有不同NWDAF实例提供不同类型的专用分析。为了让消费者NF能够发现合适的NWDAF实例来提供特定类型的分析，NWDAF实例需在向网络数据库功能(英文为：Network Repository Function，简称为：NRF)注册时提供其支持的Analytic ID(即分析类型标识)，这样消费者NF可以在向NRF查询NWDAF实例时，提供Analytic ID来指示需要何种类型的分析。5GC网络功能和OAM决定如何使用网络数据分析功能NWDAF提供的数据分析来提高网络性能。
本公开实施例中,在一种应用场景中,网络实体或者终端基于终端的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点,网络实体或者终端将模型切分点信息发送给参与模型联合推理的网络实体或者终端。
其中,可提供的算力可以是终端的剩余运存。
因此,网络实体或终端根据终端可提供的算力、终端可提供的内存、终端剩余的电量来确定切分AI/ML模型的模型切分点信息,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络实体或终端基于切分结果对模型进行联合推理操作,因此,实现了基于终端(即UE(s)) 能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
需要说明的是,UE(s)可以为一个UE或多个UE(或一组UE或任意UEs),其UE的数量可以根据具体场景确定。此外,下述实施例中针对UE上报的可提供的算力、可提供的内存、剩余的电量等场景,对于UE侧来说,是每个UE进行上报,其上报的参数为该UE自身的参数(比如该UE可提供的算力、可提供的内存、剩余的电量);对于接收侧来说,接收每个UE上报的参数,即接收到的是各个UE上报的参数(结合来说,接收到的参数有各个UE(或UE(s))可提供的算力、可提供的内存、剩余的电量)。
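上述每个UE上报的能力参数，可以组织为如下的数据结构草图。字段名与类型均为示例性假设，并非本公开定义的消息格式：

```python
from dataclasses import dataclass
from typing import Optional, Dict

@dataclass
class UECapabilityReport:
    """单个UE上报的、用于辅助模型切分分析的参数集合（字段名为假设）。"""
    supi: str                              # 用户永久标识SUPI
    compute: float                         # 可提供的算力（如剩余运存）
    memory: float                          # 可提供的内存
    battery: float                         # 剩余的电量
    privacy_level: Optional[int] = None    # 推理AI/ML模型需要的数据集的隐私等级（可选）
    layer_latency: Optional[Dict[int, float]] = None  # 模型不同层的时延需求信息（可选）
```

接收侧（网络实体）收到的即为各个UE对应的一组此类参数，可在其上执行切分点判断。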
以下将参照附图来描述本公开的实施例。
图2为本公开实施例提供的辅助模型切分的方法的第一流程示意图,如图2所示,本实施例提供的辅助模型切分的方法的执行主体为网络实体,该网络实体可以是NWDAF、AF、新的网络实体功能(例如,英文为:Model Management Function,简称为:MMF)功能、PCF中的任一个。本公开实施例提供的辅助模型切分的方法包括以下步骤:
步骤101、网络实体接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息。
可选地,所述第一消息中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者用户永久标识SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一消息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
其中,接收第一消息的具体方式可以依据具体的网络实体在相应的场景下确定,在此不做具体地限定。
步骤102、网络实体根据所述第一消息,确定AI/ML模型切分的切分结果。
可选地,所述根据所述第一消息,确定AI/ML模型切分的切分结果,可以通过以下步骤实现:
步骤a1、根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
步骤a2、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
步骤a3、将所述模型切分点信息作为所述切分结果。
具体地,网络实体可以基于终端的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点(即模型切分点信息或AI/ML模型切分点信息),然后网络实体将模型切分点信息发送给参与模型联合推理的网络实体或者终端。
可选地,所述确定所述待参与模型联合推理的UE(s)对应的模型切分点信息,可以通过以下步骤实现:
步骤b1、将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
步骤b2、若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
步骤b3、若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
步骤b4、若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值 并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
步骤b5、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
步骤b6、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
步骤b7、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
步骤b8、根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
具体地,参与模型联合推理的每个UE都需要确定其推理的层数或部分,针对每个UE(即目标UE)可以通过以下过程实现切分点的确定:
目标UE的可提供的算力和内存较少、剩余的电量较低时,只能做较少层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理。例如,假设对于一个8层的AI/ML模型,UE1可以做两层的推理(比如,UE1做前两层的推理)、UE2可以做三层的推理(比如,UE2做中间三层的 推理),AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
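上述8层模型的切分示例（UE1推理前两层、UE2推理中间三层、AF推理最后三层）中，切分点可由各参与方依次负责的层数累加得到，如下草图所示（函数名与层数列表的表示方式为示例性假设）：

```python
def split_points(layer_counts):
    """由各参与方按执行顺序负责的推理层数，累加得到模型切分点位置。
    最后一个参与方执行到模型末尾，其后不再存在切分点。"""
    points, acc = [], 0
    for n in layer_counts[:-1]:  # 最后一个参与方之后无需再切分
        acc += n
        points.append(acc)
    return points

# 8层模型：UE1执行前2层，UE2执行中间3层，AF执行最后3层
# -> 模型切分点1位于第2层之后，模型切分点2位于第5层之后
```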
本实施例中,网络实体通过接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,并根据第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量来确定切分AI/ML模型的模型切分点信息,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络(AF)和待参与模型联合推理的UE(s)基于切分结果对模型进行联合推理,因此,实现了基于终端(即UE)能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
示例性地,实施例一(网络实体为NWDAF,NWDAF进行模型切分点的判断),下述以至少四种场景为例对辅助模型切分的方法进行详细说明。
场景11:当UE(s)和AF(s)联合推理模型(s)时,AF请求NWDAF收集UE的能力进行模型切分点的判断,NWDAF收集并判断后将结果分别反馈给AF,AF发给相关的UE。
可选地,如果所述网络实体是网络数据分析功能NWDAF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,可以包括以下步骤:
步骤c11、直接或通过网络能力开放功能NEF接收应用功能AF发送的第一请求,所述第一请求用于请求对AI/ML模型切分进行分析;其中,所述第一请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延 需求信息;
步骤c12、根据所述第一请求中携带的参数，向5GC NF(s)发送第二消息，所述第二消息用于请求5GC NF(s)采集UE(s)对应的第一数据，所述第一数据中包括下述至少一项：待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小；若第一请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息，则所述第一数据中还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
步骤c13、接收5GC NF(s)发送的所述第一数据,并将所述第一数据作为所述第一消息;所述第一数据是由待参与模型联合推理的UE(s)同意所述第二消息的请求后提供的。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第三消息,所述第三消息中包括AI/ML模型切分点信息以及所述第一数据;所述第三消息用于为AF提供向待参与模型联合推理的UE(s)发送第二请求时携带的参数,所述第二请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第二请求中携带的参数包括所述第三消息。
具体地,参见图3,图3为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第一信令流程示意图。具体步骤为:(301a是AF在受信任区域,301b、301c是AF在不受信任区;步骤305a是AF在受信任区域,305b,305c是AF在不受信任区。)
步骤301a,AF向NWDAF发送Nnwdaf_MLModelSplit_Request(即机器学习模型切分请求),请求中包含和Model ID关联的Analytics ID(即分析类型标识)=MLModelSplit,请求NWDAF收集UE(即待参与模型联合推理的UE(s))的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及可选的包含模型不同层的时延需求等信息,如下表1:
表1
步骤301b,AF向NEF发送Nnef_MLModelSplit_Request(即机器学习模型切分请求),请求中包含上述表1中的信息。
步骤301c,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述表1中的信息。
步骤302,NWDAF调用Nnf_EventExposure_Subscribe(即事件开放订阅) 向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果终端通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈(参见表2)给NF(例如,AMF/SMF)。
表2
步骤303,5GC NF(s)调用Nnf_EventExposure_Notify(即事件开放通知)向NWDAF反馈所需数据。
步骤304,NWDAF执行分析,基于UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求,选出模型切分点(或者模型切分比例)。
例如,UE的可提供的算力和内存较少、剩余的电量较低时,只能做较少 层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理;例如,假设对于一个8层的ML模型,UE1可以做前两层的推理、UE2做中间三层的推理,AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
305a,通过Nnwdaf_MLModelSplit_Request Response(即机器学习模型切分请求的响应)将模型切分点(或者模型切分比例)等信息(如下表3)发送给AF。
表3
步骤305b,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息。
步骤305c，NEF授权后，向AF发送Nnef_MLModelSplit_Request response(即机器学习模型切分请求的响应)，内容包含上述模型切分点(或者模型切分比例)等的信息。
步骤306,AF和相关UE(s)建立连接。
步骤307,AF向相关UE(s)发送模型联合推理请求Naf_MLModelJointInference_Request(即模型联合推理请求),请求中包含模型切分点(或者模型切分比例)信息。
步骤308,UE(s)向AF发送模型联合推理请求的响应Naf_MLModelJointInference_Request response(即模型联合推理请求的响应),表示是否接受这个模型联合推理请求。
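场景11中步骤301~308的主要信令交互顺序（取可信域路径301a/305a），可概括为如下示意列表。服务名取自本公开，三元组(发送方, 接收方, 服务消息)的结构为示例性假设：

```python
# 场景11的主要信令交互顺序示意
# 步骤304（NWDAF执行分析）与步骤306（AF和相关UE(s)建立连接）为本地操作，不在列表中
FLOW_SCENARIO_11 = [
    ("AF", "NWDAF", "Nnwdaf_MLModelSplit_Request"),                 # 步骤301a
    ("NWDAF", "5GC NF(s)", "Nnf_EventExposure_Subscribe"),          # 步骤302
    ("5GC NF(s)", "NWDAF", "Nnf_EventExposure_Notify"),             # 步骤303
    ("NWDAF", "AF", "Nnwdaf_MLModelSplit_Request Response"),        # 步骤305a
    ("AF", "UE(s)", "Naf_MLModelJointInference_Request"),           # 步骤307
    ("UE(s)", "AF", "Naf_MLModelJointInference_Request response"),  # 步骤308
]
```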
场景12:当UE(s)和AF(s)联合推理(多个)模型时,UE上报自身的能力给AF,AF请求NWDAF进行模型切分点的判断,NWDAF把判断结果反馈给AF,AF发送给UE。
可选地，如果所述网络实体是网络数据分析功能NWDAF，所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息，可以包括以下步骤：
直接或通过网络能力开放功能NEF接收AF发送的第三请求;所述第三请求是由AF确定与待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第四请求中携带的参数确定的,所述第四请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第四请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,所述第三请求中携带的参数包括所述第四请求中携带的参数;将所述第三请求中携带的参数作为所述第一消息;
其中,若第四请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第四消息,所述第四消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第四消息用于为AF提供向待参与模型联合推理的UE(s)发送的第四请求结果,所述第四请求结果中包括所述AI/ML模型切分点信息。
具体地,参见图4,图4为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第二信令流程示意图。具体步骤为:(404a是AF在受信任区域,404b、404c是AF在不受信任区;步骤405a是AF在受信任区域,405b、405c是AF在不受信任区。)
步骤400,AF和UE建立连接
步骤401,UE发起和AF进行模型联合推理请求Naf_MLModelJointInference_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit、UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供。
步骤402,AF判断这个联合推理请求。
步骤403a,AF向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求等信息,并请求NWDAF进行模型切分点的判断。
步骤403b,AF向NEF发送Nnef_MLModelSplit_Request,请求中包含上述信息。
步骤403c,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述信息。
步骤404,NWDAF执行分析,基于UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求,选出模型切分点(或者模型切分比例)。
例如,UE的可提供的算力、可提供的内存、剩余的电量较低时,只能做 较少层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理。例如,假设对于一个8层的ML模型,UE1可以做前两层的推理、UE2做中间三层的推理,AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
步骤405a,NWDAF通过Nnwdaf_MLModelSplit_Request Response将模型切分点(或者模型切分比例)等信息发送给AF。
步骤405b,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)。
步骤405c，NEF授权后，向AF发送Nnef_MLModelSplit_Request response，内容包含上述模型切分点(或者模型切分比例)。
步骤406,AF把模型联合推理请求的响应Naf_MLModelJointInference_Request response发送给UE,响应中包含模型切分点(或者模型切分比例)。
场景13、当UE(s)和AF(s)联合推理模型时,UE上报自身的能力给AF,AF请求NWDAF进行模型切分点的判断并请求其他可参与模型推理的UE(s)参与本模型的推理,NWDAF把收集和判断结果反馈给AF,AF发送给UE。
可选地,如果所述网络实体是网络数据分析功能NWDAF,所述接收用于请求人工智能/机器学习AI/ML模型切分分析的第一消息,可以包括以下步骤:
步骤c31、直接或通过网络能力开放功能NEF接收应用功能AF发送的第五请求,所述第五请求是由AF确定与待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第六请求确定的,所述第五请求用于请求对AI/ML模型切分进行分析以及请求查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量;其中,所述第五请求中携带的参 数包括下述至少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第六请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第六请求中携带的参数包括所述第五请求中携带的参数;
步骤c32、根据所述第五请求中携带的参数,向5GC NF(s)发送第五消息,所述第五消息用于请求5GC NF(s)采集UE(s)对应的第二数据,所述第二数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第五请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第二数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤c33、接收5GC NF(s)发送的所述第二数据,并将所述第二数据作为所述第一消息;所述第二数据是由待参与模型联合推理的UE(s)同意所述第五消息的请求后提供的;
其中,若第六请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第六消息,所述第六消息中至少包括所述AI/ML模型切分点信息以及所述第二数据;所述第六消息用于为AF提供向待参与模型联合推理的UE(s)发送的第六请求结果,所述第六请求结果中包括所述待参与模型联合推理的UE(s)对应的所述AI/ML模型切分点信息、所述其他可参与模型推理的UE(s)以及对应的所述AI/ML模型切分点信息。
具体地,参见图5,图5为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第三信令流程示意图。具体步骤为:(步骤503a是 AF在受信任区域,步骤503b、步骤503c是AF在不受信任区;步骤507a是AF在受信任区域,步骤507b、步骤507c是AF在不受信任区。)
步骤500-501同场景12的实施例中的步骤400-401所描述。(即:
步骤500,AF和UE建立连接。
步骤501，UE发起和AF进行模型联合推理请求Naf_MLModelJointInference_Request，请求中包含和Model ID关联的Analytics ID=MLModelSplit、UE自身的可提供的算力、可提供的内存、剩余的电量等信息；如果有推理AI/ML模型时需要数据集的隐私等级，则请求中还包含推理AI/ML模型需要的数据集的隐私等级，如果没有，基于SA3确认是否可以设定；如果有推理AI/ML模型时需要模型不同层的时延需求信息，则请求中还包含模型不同层的时延需求信息，如果没有，由AF提供。)
步骤502，AF判断联合推理请求，如果判断UE和AF自身能力不够做联合推理，则认为需要其他UE(s)帮忙进行模型切分。
步骤503a,AF向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求等信息,以及请求NWDAF帮忙发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等并请求NWDAF进行模型切分点的判断。
步骤503b,AF向NEF发送Nnef_MLModelSplit_Request,请求中包含上述信息。
步骤503c,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述信息。
步骤504-步骤505同场景11的实施例中的步骤302-303所描述(即：
步骤504,NWDAF调用Nnf_EventExposure_Subscribe向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈给 NF(例如,AMF/SMF)。
步骤505,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据)。
步骤506,NWDAF执行分析,并基于output data中的可参与模型切分的列表中的UE(s),以及和他们对应的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求,选出模型切分点(或者模型切分比例)。
例如,UE的可提供的算力和内存较少、剩余的电量较低时,只能做较少层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理;例如,假设对于一个8层的ML模型,UE1可以做前两层的推理、UE2做中间三层的推理,AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
步骤507(即步骤507a-507c)同场景11的实施例中的步骤305(即步骤305a-305c)所描述(即:
步骤507a,通过Nnwdaf_MLModelSplit_Request Response将模型切分点(或者模型切分比例)等信息发送给AF,如表3。
步骤507b,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息。
步骤507c，NEF授权后，向AF发送Nnef_MLModelSplit_Request response，内容包含上述模型切分点(或者模型切分比例)等的信息)。
步骤508,AF把模型联合推理请求的响应Naf_MLModelJointInference_Request response发送给UE,响应中包含模型切分点(或者模型切分比例)以及其他可参与模型推理的UE(s)以及其相关的模型切分点(或者模型切分比例)信息。
场景14:当UE和NWDAF联合推理模型时,UE通过AF上报自身的能力NWDAF,同时请求NWDAF进行模型切分点的判断,NWDAF把判断结 果通过AF反馈给UE。
可选地,如果所述网络实体是网络数据分析功能NWDAF,所述接收用于请求人工智能/机器学习AI/ML模型切分分析的第一消息,可以包括以下步骤:
步骤c41、接收待参与模型联合推理的UE(s)发送的第七请求，所述第七请求用于请求与NWDAF执行模型联合推理操作；所述第七请求中携带的参数包括下述至少一项：待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第七请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；根据所述第七请求中携带的参数，确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力；
步骤c42、若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力，则将所述第七请求中携带的参数作为所述第一消息；
步骤c43、若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则请求NF查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,且若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则请求提供其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤c44、接收NF发送的查找结果,所述查找结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则查找结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤c45、将所述第七请求中携带的参数以及所述查找结果作为所述第一消息;
其中,若第七请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述待参与模型联合推理的UE(s)发送第七请求结果,所述第七请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第七请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
具体地,参见图6,图6为本公开实施例一提供的当网络实体是NWDAF时辅助模型切分的方法的第四信令流程示意图。具体步骤为:
步骤600,AF和UE建立连接。
步骤601,UE发起和NWDAF进行模型联合推理请求Nnwdaf_MLModelJointInference_Request(即模型联合推理请求),请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供;通过AF透传给NWDAF。
步骤602,NWDAF判断联合推理请求。
可选地,步骤603-604,如果需要,通过NF发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等。(即:
步骤603，NWDAF调用Nnf_EventExposure_Subscribe向5GC NF(s)(例如，AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量，(如果有)推理此模型需要的数据集的隐私等级，以及(如果能提供)模型不同层的时延需求；5GC NF(s)(例如，AMF/SMF)将这些请求信息发送给终端，终端收到这个请求，(如果通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力，(如果有)数据集的隐私等级等信息反馈给NF(例如，AMF/SMF)。
步骤604,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所 需数据。)
步骤605,NWDAF基于收集到的信息以及自身拥有的模型执行分析,进行模型切分点(或者模型切分比例)的判断。
步骤606,NWDAF向UE反馈AI/ML模型联合推理请求的响应Nnwdaf_MLModelJointInference_Request response(即模型联合推理请求的响应),请求中包含模型切分点(或者模型切分比例)信息,如果有其他参与联合模型推理的UE(s),则包含相应UE(s)的模型切分点(或者模型切分比例)信息,通过AF透传给UE。
示例性地,实施例二(网络实体是AF,AF进行模型切分点的判断。),下述以至少两种场景为例对辅助模型切分的方法进行详细说明。
场景21:当UE和AF联合推理模型时,UE上报自身的能力给AF,AF进行联合推理和模型切分点的判断,然后把判断结果反馈给UE。
可选地,如果所述网络实体是应用功能AF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,可以包括以下步骤:
步骤d11、接收待参与模型联合推理的UE(s)发送的第八请求,所述第八请求用于请求与AF执行模型联合推理操作;所述第八请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第八请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第八请求中携带的参数,确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
步骤d12、若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第八请求中携带的参数作为所述第一消息;
步骤d13、若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向网络数据分析功能NWDAF发送第九请求,所述第九请求中携带的参数包括所述第八请求中携带的参数,且所述第九请求用于 请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;(其中,所述第九请求结果是由NWDAF通过向5GC NF(s)采集UE(s)对应的第三数据确定的;所述第三数据中包括下述至少一项:参与AI/ML模型切分的UE或者SUPI、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的算力、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的内存、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))剩余的电量、AI/ML模型的大小;若第九请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第三数据中还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第九请求结果包括下述至少一项:第三数据、其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。)
步骤d14、接收NWDAF发送的第九请求结果,并将所述第九请求结果以及所述第八请求中携带的参数作为所述第一消息;其中,所述第九请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
若第八请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,在确定AI/ML模型切分的切分结果之后,所述方法还包括:
将确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力的结果、所述待参与模型联合推理的UE(s)的相关信息以及所述待参与模型联合推理的UE(s)对应的模型切分点信息发送给所述UE;
其中,所述相关信息包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述相关信息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
具体地,参见图7,图7为本公开实施例二提供的当网络实体是AF时辅助模型切分的方法的第一信令流程示意图。具体步骤为:(步骤703a的是AF在受信任区域,703b、703c是AF在不受信任区;步骤706a是AF在受信任区域,706b、706c是AF在不受信任区。)
步骤700-701同场景13的实施例中的步骤500-501所描述(即:
步骤700,AF和UE建立连接。
步骤701，UE发起和AF进行模型联合推理请求Naf_MLModelJointInference_Request，请求中包含和Model ID关联的Analytics ID=MLModelSplit，UE自身的可提供的算力、可提供的内存、剩余的电量等信息；如果有推理AI/ML模型时需要数据集的隐私等级，则请求中还包含推理AI/ML模型需要的数据集的隐私等级，如果没有，基于SA3确认是否可以设定；如果有推理AI/ML模型时需要模型不同层的时延需求信息，则请求中还包含模型不同层的时延需求信息，如果没有，由AF提供。)
步骤702,AF判断联合推理请求,如果需要其他UE(s)参与联合推理模型,则执行步骤703(即步骤703a-703c)-706,否则跳过。
步骤703(即步骤703a-703c)-706(即步骤706a-706c)如场景13的实施例中的步骤503(即步骤503a-503c)-505、步骤507(即步骤507a-507c)所描述(即：步骤703，AF向NWDAF发送Nnwdaf_AnalyticsSubscription_Subscribe(即分析订阅订阅)。实施过程：
步骤703a,AF向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以 及模型不同层的时延需求等信息,以及请求NWDAF帮忙发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等并请求NWDAF进行模型切分点的判断。
步骤703b,AF向NEF发送Nnef_MLModelSplit_Request,请求中包含上述步骤703a的信息。
步骤703c,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述步骤703a信息。
步骤704,NWDAF调用Nnf_EventExposure_Subscribe向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈给NF(例如,AMF/SMF)。
步骤705,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据)。
步骤706(即步骤706a-706c)如场景13的实施例中的步骤507(即步骤507a-507c)所描述(即:步骤706,NWDAF向AF发送Nnwdaf_AnalyticsSubscription_Notify(即分析订阅通知)。实施过程:
步骤706a,通过Nnwdaf_MLModelSplit_Response将模型切分点(或者模型切分比例)等信息发送给AF,如表3。
步骤706b,NWDAF向NEF发送Nnwdaf_MLModelSplit_Response,内容包含上述模型切分点(或者模型切分比例)等的信息。
步骤706c，NEF授权后，向AF发送Nnef_MLModelSplit_Request response，内容包含上述模型切分点(或者模型切分比例)等的信息。)
步骤707,AF基于接收到的信息以及自身的模型和能力,判断相关UE的模型切分点(模型切分比例)。
步骤708，AF通过Naf_MLModelJointInference_Request response将判断结果和相关UE信息以及对应模型切分点(模型切分比例)信息发送给UE。该信息也可以实现UE和UE之间进行联合模型推理。
场景22、当UE和AF联合推理模型时,AF请求NWDAF收集UE的能力,AF基于收到的分析结果,进行模型切分点的判断,并向相关UE发起联合推理请求。
可选地,如果所述网络实体是应用功能AF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
步骤d21、向网络数据分析功能NWDAF发送第十请求,所述第十请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的用于分析AI/ML模型切分的数据,所述第十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤d22、接收NWDAF发送的第十请求结果,所述第十请求结果包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第十请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第十请求结果中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤d23、将所述第十请求结果作为所述第一消息。
可选地,在确定AI/ML模型切分的切分结果之后,所述方法还包括:
步骤d24、向待参与模型联合推理的UE(s)发送第十一请求,所述第十一请求用于请求与所述待参与模型联合推理的UE(s)执行模型联合推理操作;其中,所述第十一请求中携带的参数包括下述至少一项:模型切分点信息、所述第一消息;
步骤d25、接收待参与模型联合推理的UE(s)发送的第十一请求结果,所 述第十一请求结果是由待参与模型联合推理的UE(s)根据第十一请求中携带的参数确定的,所述第十一请求结果包括接受第十一请求或不接受第十一请求。
具体地,参见图8,图8为本公开实施例二提供的当网络实体是AF时辅助模型切分的方法的第二信令流程示意图。具体步骤为:(步骤801a是AF在受信任区域,步骤801b、步骤801c是AF在不受信任区;步骤804a是AF在受信任区域,步骤804b、步骤804c是AF在不受信任区。)
步骤801(即801a-801c)-804(即804a-804c),AF请求NWDAF收集UE的能力并获得分析结果如场景11的实施例中的步骤301(即301a-301c)~303、305(即305a-305c)所描述(即:
步骤801a,AF向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含和Model ID关联的Analytics ID(即分析类型标识)=MLModelSplit,请求NWDAF收集UE(即待参与模型联合推理的UE(s))的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及可选的包含模型不同层的时延需求等信息,如表1。
步骤801b,AF向NEF发送Nnef_MLModelSplit_Request,请求中包含上述表1中的信息。
步骤801c,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述表1中的信息。
步骤802,NWDAF调用Nnf_EventExposure_Subscribe(即事件开放订阅)向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果终端通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈(参见表2)给NF(例如,AMF/SMF)。
步骤803,5GC NF(s)调用Nnf_EventExposure_Notify(即事件开放通知)向NWDAF反馈所需数据。
步骤804a,通过Nnwdaf_MLModelSplit_Request Response将模型切分点 (或者模型切分比例)等信息(如表3)发送给AF。
步骤804b,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息(参见表3)。
步骤804c,NEF授权后,向NWDAF发送Nnef_MLModelSplit_Request response,内容包含上述模型切分点(或者模型切分比例)等的信息(参见表3)。)
步骤805,AF基于接收到的信息以及自身的模型和能力,进行模型切分点判断。
步骤806-808,AF和相关UE(s)建立连接并进行模型联合推理请求的响应如场景11的实施例中的步骤306-308所示(即:
步骤806,AF和相关UE(s)建立连接。
步骤807,AF向相关UE(s)发送模型联合推理请求Naf_MLModelJointInference_Request,请求中包含模型切分点(或者模型切分比例)信息。
步骤808,UE(s)向AF发送模型联合推理请求响应Naf_MLModelJointInference_Request response,表示是否接受这个模型联合推理请求。)。
示例性地,实施例三(网络实体为MMF(MMF可以进行模型的管理;也可以进行模型切分),MMF进行模型切分点的判断),下述以至少两种场景为例对辅助模型切分的方法进行详细说明。
场景31、当UE和AF联合推理模型时,AF请求新的网络实体MMF进行模型切分,MMF请求NWDAF收集UE的能力,MMF基于收到的分析结果,进行模型切分点的判断,并向AF反馈结果,AF向相关UE发起联合推理请求。
可选地,如果所述网络实体是新的网络实体MMF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
步骤e11、直接或通过网络能力开放功能NEF接收应用功能AF发送的第十二请求,所述第十二请求用于请求AI/ML模型切分分析;所述第十二请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析 类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤e12、根据所述第十二请求中携带的参数,向NWDAF发送第十三请求,所述第十三请求中携带的参数包括所述第十二请求中携带的参数,且所述第十三请求用于请求NWDAF向5GC NF(s)采集UE(s)的用于分析AI/ML模型切分的数据;
步骤e13、接收NWDAF发送的第十三请求结果；其中，所述第十三请求结果包括下述至少一项：待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小；若第十三请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息，则所述第十三请求结果中还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
步骤e14、将所述第十三请求结果作为所述第一消息。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第八消息,所述第八消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第八消息用于为AF提供向待参与模型联合推理的UE(s)发送第十四请求时携带的参数,所述第十四请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十四请求中携带的参数包括所述第八消息。
具体地,参见图9,图9为本公开实施例三提供的当网络实体是MMF时辅助模型切分的方法的第一信令流程示意图。具体步骤为:
步骤901,AF向MMF发送Nmmf_MLModelSplit_Request(即机器学习模型切分请求),请求中包含的参数同场景11的实施例中的步骤301的请求 包含的参数(即:请求中包含和Model ID关联的Analytics ID=MLModelSplit,请求NWDAF收集UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及可选的包含模型不同层的时延需求等信息,如表1。)
步骤902-905如场景11的实施例中的步骤301c,302,303,305b所描述(即:
步骤902,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述表1中的信息。
步骤903,NWDAF调用Nnf_EventExposure_Subscribe(即事件开放订阅)向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果终端通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈(参见表2)给NF(例如,AMF/SMF)。
步骤904,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据。
步骤905,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息)
步骤906,MMF基于NWDAF的分析输出结果output data(参见表3),即基于UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求,选出模型切分点(或者模型切分比例)。
步骤907，MMF向AF通过Nmmf_MLModelSplit_Request response(机器学习模型切分请求的响应)发送模型切分点(模型切分比例)等信息。
步骤908-910如场景11的实施例中的步骤306-308所描述(即：
步骤908,AF和相关UE(s)建立连接。
步骤909,AF向相关UE(s)发送模型联合推理请求Naf_MLModelJointInference_Request,请求中包含模型切分点(或者模型切分比例)信息。
步骤910，UE(s)向AF发送模型联合推理请求的响应Naf_MLModelJointInference_Request response，表示是否接受这个模型联合推理请求)。
场景32、当UE和AF联合推理模型时,UE上报自身能力给AF请求模型联合推理,AF将自身以及UE能力传给MMF,请求MMF进行模型切分,MMF判断如果需要其他UE(s)参与模型联合推理,则MMF请求NWDAF收集其他UE(s)的信息,MMF基于收集的信息进行模型切分并将结果反馈给AF。
可选地,如果所述网络实体是新的网络实体MMF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
步骤e21、直接或通过网络能力开放功能NEF接收应用功能AF发送的第十五请求,所述第十五请求用于请求AI/ML模型切分分析,所述第十五请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤e22、根据所述第十五请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
步骤e23、若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第十五请求中携带的参数作为所述第一消息;
步骤e24、若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第十六请求,所述第十六请求中携带的参数包括所述第十五请求中携带的参数,且所述第十六请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤e25、接收NWDAF发送的第十六请求结果,并将所述第十六请求结果以及第十五请求中携带的参数作为所述第一消息;其中,所述第十六请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。(其中,所述第十六请求结果是由NWDAF通过向5GC NF(s)采集UE(s)对应的第四数据确定的;所述第四数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的算力、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的内存、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四数据中还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第十六请求结果中包括下述至少一项:第四数据、其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。)
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第十五请求结果,所述第十五请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第十五请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述待参与模型联合推理的UE(s)。
具体地,参见图10,图10为本公开实施例三提供的当网络实体是MMF时辅助模型切分的方法的第二信令流程示意图。具体步骤为:
步骤1000-1001如场景13的实施例中的步骤500-501所描述(即:
步骤1000,AF和UE建立连接。
步骤1001,UE发起和AF进行ML模型联合推理请求Naf_MLModelJointInference_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供)。
步骤1002,AF向MMF发送模型切分请求Nmmf_MLModelSplit_Request,请求中包含的信息如场景13的实施例中的步骤503a所描述(即:
请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求等信息,以及请求NWDAF帮忙发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等。)
步骤1003,MMF对联合推理请求进行判断,如果需要其他UE(s)参与模型联合推理,则执行步骤1004-1007,否则跳过。
步骤1004-1007如场景13的实施例中的步骤503c、504、505、507b所描述(即:
步骤1004,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述信息。
步骤1005,NWDAF调用Nnf_EventExposure_Subscribe向5GC NF(s)(例如AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈给NF(例如,AMF/SMF)。
步骤1006,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据;
步骤1007,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息)
步骤1008,MMF基于收集到的信息进行模型切分的判断。
步骤1009,MMF将结果通过Nmmf_MLModelSplit_Request response发送给AF,响应中包含的信息如场景11的实施例中的步骤305所描述(即:模型切分点(或者模型切分比例)等信息,如表3)
步骤1010如场景13的实施例中的步骤508所描述。(即:
步骤1010,AF把模型联合推理请求的响应Naf_MLModelJointInference_Request response发送给UE,响应中包含模型切分点(或者模型切分比例)以及其他可参与模型推理的UE(s)以及其相关的模型切分点(或者模型切分比例)信息)。
示例性地,实施例四(网络实体为PCF,PCF进行模型切分点的判断),下述以至少两种场景为例对辅助模型切分的方法进行详细说明。
场景41、当UE和AF联合推理模型时,AF请求PCF进行模型切分的策略判断,PCF请求NWDAF收集UE的能力,PCF基于收到的分析结果,进行模型切分点的判断,并向AF反馈结果,AF向相关UE发起联合推理请求。
可选地,如果所述网络实体是策略控制功能PCF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
步骤f11、直接或通过网络能力开放功能NEF接收应用功能AF发送的第十七请求,所述第十七请求用于请求AI/ML模型切分分析;所述第十七请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤f12、根据所述第十七请求中携带的参数,向网络数据分析功能 NWDAF发送第十八请求,所述第十八请求中携带的参数包括所述第十七请求中携带的参数,且所述第十八请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的第五数据;所述第五数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤f13、接收NWDAF发送的第十八请求结果;其中,所述第十八请求结果包括所述第五数据;
步骤f14、将所述第十八请求结果作为所述第一消息。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第九消息,所述第九消息中包括所述AI/ML模型切分点、所述第一消息的数据;所述第九消息用于为AF提供向待参与模型联合推理的UE(s)发送第十九请求时携带的参数,所述第十九请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十九请求中携带的参数包括所述第九消息。
具体地,参见图11,图11为本公开实施例四提供的当网络实体是PCF时辅助模型切分的方法的第一信令流程示意图。具体步骤为:
步骤1101,AF向PCF发送Npcf_MLModelSplit_Request(即机器学习模型切分请求),请求中包含的参数同场景11的实施例中的步骤301的请求(即:请求中包含和Model ID关联的Analytics ID=MLModelSplit,请求NWDAF收集UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及可选的包含模型不同层的时延需求等信息,如表1。)
步骤1102-1105如场景11的实施例中的步骤301c,302,303,305b所描述(即:
步骤1102,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述表1中的信息。
步骤1103,NWDAF调用Nnf_EventExposure_Subscribe(即事件开放订 阅)向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果终端通过这个请求)就准备和该Analytics ID(Model ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈(参见表2)给NF(例如,AMF/SMF)。
步骤1104,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据。
步骤1105,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息)。
步骤1106,PCF基于NWDAF的分析输出结果output data,即基于UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求,选出模型切分点(或者模型切分比例)。
例如,UE的可提供的算力和内存较少、剩余的电量较低时,只能做较少层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理;例如,假设对于一个8层的AI/ML模型,UE1可以做两层的推理(比如,UE1做前两层的推理)、UE2可以做三层的推理(比如,UE2做中间三层的推理),AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
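以步骤1106及上述8层模型的例子为参照,切分点的选取可用如下Python示意(假设性实现,按各参与方可承担的层数依次分配连续层区间,并非标准算法):

```python
def assign_layers(total_layers, capacities):
    """按参与方可承担的层数依次分配连续的层区间,返回各参与方的层范围和切分点。
    capacities: [(参与方名称, 可承担层数), ...],列表顺序即推理顺序。"""
    assignments, split_points = {}, []
    start = 1
    for name, n_layers in capacities:
        end = min(start + n_layers - 1, total_layers)
        assignments[name] = (start, end)
        if end < total_layers:
            split_points.append(end)  # 切分点即该参与方负责推理的最后一层
        start = end + 1
    return assignments, split_points

# 对应文中示例:8层模型,UE1做前两层、UE2做中间三层、AF做最后三层
assignments, split_points = assign_layers(8, [("UE1", 2), ("UE2", 3), ("AF", 3)])
# assignments == {"UE1": (1, 2), "UE2": (3, 5), "AF": (6, 8)},split_points == [2, 5]
```

即UE1对应模型切分点1(第2层之后)、UE2对应模型切分点2(第5层之后),与文中描述一致;每个参与方可承担的层数如何得出,见前述基于算力、内存、电量等的判断。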
步骤1107,PCF向AF通过Npcf_MLModelSplit_Request response(即机器学习模型切分请求的响应)发送模型切分点(模型切分比例)等信息。
步骤1108-1110如场景11的实施例中的步骤306-308所描述(即:
步骤1108,AF和相关UE(s)建立连接。
步骤1109,AF向相关UE(s)发送模型联合推理请求Naf_MLModelJointInference_Request,请求中包含模型切分点(或者模型切分 比例)信息。
步骤1110,UE(s)向AF发送模型联合推理请求的响应Naf_MLModelJointInference_Request response,表示是否接受这个模型联合推理请求)。
场景42、当UE和AF联合推理模型时,UE上报自身能力给AF请求模型联合推理,AF将自身以及UE能力发给PCF,请求PCF进行模型切分,PCF判断如果需要其他UE(s)参与模型联合推理,则PCF请求NWDAF收集其他UE(s)的信息,PCF基于收集的信息进行模型切分并将结果反馈给AF。
可选地,如果所述网络实体是策略控制功能PCF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,可以包括以下步骤:
步骤f21、直接或通过网络能力开放功能NEF接收应用功能AF发送的第二十请求,所述第二十请求用于请求AI/ML模型切分分析,所述第二十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤f22、根据所述第二十请求中携带的参数,确定所述AF以及待参与模型联合推理的UE是否具备完全支持执行模型联合推理的能力;
步骤f23、若所述AF以及待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第二十请求中携带的参数作为所述第一消息;
步骤f24、若所述AF以及待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第二十一请求,所述第二十一请求中携带的参数包括所述第二十请求中携带的参数,且所述第二十一请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数 据集的隐私等级和/或模型不同层的时延需求信息;
步骤f25、接收NWDAF发送的第二十一请求结果,并将所述第二十一请求结果以及所述第二十请求中携带的参数作为所述第一消息;其中,所述第二十一请求结果包括:其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;(所述第二十一请求结果是由NWDAF通过向5GC NF(s)采集UE(s)对应的第六数据确定的;所述第六数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的算力、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))可提供的内存、待参与模型联合推理的UE(s)(这里的待参与模型联合推理的UE(s)为上述的其他可参与模型推理的UE(s))剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第六数据中还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第二十一请求结果中包括下述至少一项:第六数据、其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。)
步骤f26、若第二十请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
向所述AF发送第二十请求结果,所述第二十请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第二十请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
具体地,参见图12,图12为本公开实施例四提供的当网络实体是PCF时辅助模型切分的方法的第二信令流程示意图。具体步骤为:
步骤1200-1201如场景13的实施例中的步骤500-501所描述(即:
步骤1200,AF和UE建立连接
步骤1201,UE发起和AF进行ML模型联合推理请求Naf_MLModelJointInference_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供)。
步骤1202,AF向PCF发送模型切分请求Npcf_MLModelSplit_Request,请求中包含的信息如场景13的实施例中的步骤503a所描述(即请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求等信息,以及请求NWDAF帮忙发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等)
步骤1203,基于接收到的信息,例如,UE、AF报告的自身能力等,PCF对联合推理请求进行判断,如果需要其他UE(s)参与模型联合推理,则执行步骤1204-1207,否则跳过。
步骤1204-1207如场景13的实施例中的步骤503c、504、505、507b所描述(即:
步骤1204,NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述信息。
步骤1205,NWDAF调用Nnf_EventExposure_Subscribe向5GC NF(s)(例如,AMF/SMF)采集UE(s)的可提供的算力、可提供的内存、剩余的电量,(如果有)推理此模型需要的数据集的隐私等级,以及(如果能提供)模型不同层的时延需求;5GC NF(s)(例如,AMF/SMF)将这些请求信息发送给终端,终端收到这个请求,(如果通过这个请求)就准备和该Analytics ID(Model  ID)对应的电量、内存、算力,(如果有)数据集的隐私等级等信息反馈给NF(例如,AMF/SMF)。
步骤1206,5GC NF(s)调用Nnf_EventExposure_Notify向NWDAF反馈所需数据。
步骤1207,NWDAF向NEF发送Nnwdaf_MLModelSplit_Request Response,内容包含上述模型切分点(或者模型切分比例)等的信息。)
步骤1208,PCF基于收集到的信息进行模型切分的判断。
步骤1209,PCF将结果通过Npcf_MLModelSplit_Request response发送给AF,响应中包含的信息如场景11的实施例中的步骤305所描述(即:模型切分点(或者模型切分比例)等信息,如表3。)
步骤1210如场景13的实施例中的步骤508所描述。(即:
步骤1210,AF把模型联合推理请求的响应Naf_MLModelJointInference_Request response发送给UE,响应中包含模型切分点(或者模型切分比例)以及其他可参与模型推理的UE(s)以及其相关的模型切分点(或者模型切分比例)信息)。
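其中步骤1203所述"判断是否需要其他UE(s)参与模型联合推理"的能力判定,可用如下Python示意(判定准则为本示意的假设:以AF与现有UE可提供的算力、内存总和是否覆盖模型推理需求为判据,并非标准定义):

```python
def need_other_ues(model_demand, participants):
    """判断AF与现有UE是否具备完全支持执行模型联合推理的能力。
    model_demand: 模型推理所需总量,如 {"flops": ..., "memory": ...}(假设性度量);
    participants: 各参与方可提供的 {"flops": ..., "memory": ...} 列表。
    返回True表示能力不足,需要请求NWDAF查找其他可参与模型推理的UE(s)。"""
    total_flops = sum(p["flops"] for p in participants)
    total_mem = sum(p["memory"] for p in participants)
    return total_flops < model_demand["flops"] or total_mem < model_demand["memory"]

demand = {"flops": 10.0, "memory": 1024.0}
parties = [{"flops": 4.0, "memory": 512.0},   # AF自身能力
           {"flops": 3.0, "memory": 256.0}]   # 发起请求的UE上报的能力
# 算力合计7 < 10 → 能力不足,触发向NWDAF发送查找其他UE(s)的请求
```

实际判定还可结合剩余电量、隐私等级和逐层时延等因素,此处仅示意"不足则请求NWDAF"的分支逻辑。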
本公开实施例,通过PCF接收AF发送的模型切分请求,以及请求中包含的参数;PCF进行模型切分点的判断,如果需要,从NWDAF请求分析结果。或者,通过AF接收UE发送的模型切分请求,以及请求中包含的参数;AF基于自身信息或者从NWDAF请求得到的分析结果进行模型切分点的判断;AF向UE发送分析和判断结果。或者,通过引入新的模型管理网络实体MMF,MMF接收AF发送的模型切分请求,以及请求中包含的参数;MMF进行模型切分点的判断,如果需要,从NWDAF请求分析结果。或者,NWDAF接收AF/终端/PCF/新网络实体等发送的模型切分请求,以及请求中包含的参数;NWDAF从5GC NF(s)采集输入数据,用于模型切分的分析(如果其他网络实体不能做,NWDAF进行分析和模型切分判断),向AF/终端/PCF/新网络实体等发送分析和判断结果。因此,网络实体基于终端的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点,网络实体将模型切分点信息发送给参与模型联合推理的网络实体或者终端。基于终端能力实现模型切分推理,从而有益于保护终端隐私,优化网络资源。
图13为本公开实施例提供的辅助模型切分的方法的第二流程示意图,如图13所示,本实施例提供的辅助模型切分的方法的执行主体为用户设备(或终端)UE,则本公开实施例提供的辅助模型切分的方法包括以下步骤:
步骤201、UE根据自身能力信息,确定AI/ML模型切分点信息。
可选地,根据自身能力信息,确定AI/ML模型切分点信息,可以包括以下步骤:
步骤g1、根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;
步骤g2、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述自身能力信息中的可提供的算力、可提供的内存、剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定模型切分点信息。
步骤202、UE将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
具体地,终端可以基于自身可提供的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点(即模型切分点信息或AI/ML模型切分点信息),然后终端将模型切分点信息发送给参与模型联合推理的网络实体或者终端。
可选地,确定模型切分点信息,可以包括以下步骤:
将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
步骤h1、若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
步骤h2、若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值,和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值,以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时,确定所述目标UE执行第二预设数目层的推理,第一预设数目层小于第二预设数目层;
步骤h3、若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
步骤h4、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
步骤h5、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
步骤h6、若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
步骤h7、根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
具体地,参与模型联合推理的每个UE都需要确定其推理的层数或部分,针对每个UE(即目标UE)可以通过以下过程实现切分点的确定:
目标UE的可提供的算力和内存较少、剩余的电量较低时,只能做较少层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且隐私等级需求较高和/或模型不同层的时延需求较低时做较少层的推理;如果UE的可提供的算力和内存充足、剩余的电量较高时,可以做较多层的推理,如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的 时延,且隐私等级需求较低和/或模型不同层的时延需求较高时做较多层的推理。例如,假设对于一个8层的AI/ML模型,UE1可以做两层的推理(比如,UE1做前两层的推理)、UE2可以做三层的推理(比如,UE2做中间三层的推理),AF做最后三层的推理,即UE1对应模型切分点1,UE2对应模型切分点2。
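步骤h1-h3所述的阈值分档逻辑可以用如下Python示意(阈值数值与各档层数的映射均为假设;任一指标低于第一档阈值时取最少层数对应步骤h1的"或"条件,否则取三项指标各自档位中的最低档,这一保守取法也是本示意的假设):

```python
def layers_by_capability(power, memory, battery,
                         power_th, memory_th, battery_th, layer_tiers):
    """根据算力/内存/电量落入的阈值区间,确定目标UE可执行的预设数目层。
    power_th/memory_th/battery_th: 升序排列的阈值列表;
    layer_tiers: 各档对应的推理层数,长度为阈值个数加一。"""
    def tier(value, thresholds):
        t = 0
        for th in thresholds:
            if value > th:
                t += 1
        return t

    tiers = [tier(power, power_th), tier(memory, memory_th), tier(battery, battery_th)]
    if 0 in tiers:               # 任一指标低于第一预设阈值 → 第一预设数目层(步骤h1)
        return layer_tiers[0]
    return layer_tiers[min(tiers)]  # 否则按三项指标中的最低档确定层数(步骤h2、h3及以此类推)

# 假设阈值与层数:低于第一档做1层,第二档做3层,第三档做5层
n = layers_by_capability(power=2.5, memory=300, battery=0.7,
                         power_th=[1, 2], memory_th=[128, 256], battery_th=[0.2, 0.5],
                         layer_tiers=[1, 3, 5])
# 三项指标均落入第三档 → 5层
```

隐私等级与逐层时延的分档(步骤h4-h6)可以按同样的分档方式叠加,此处从略。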
本实施例中,根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;如果推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还可以结合自身能力信息中的推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息来确定切分AI/ML模型的模型切分点信息,作为切分结果,用以实现网络实体或待参与模型联合推理的UE(s)基于切分结果对模型进行联合推理操作,因此,实现了基于终端(即UE)能力对模型切分的分析,进而有效地实现对终端隐私的保护以及网络资源的优化。
示例性地,实施例五(UE进行模型切分点的判断),下述以至少两种场景为例对辅助模型切分的方法进行详细说明。
场景51、当UE和AF联合推理模型时,UE基于自身能力进行模型切分点的判断,上报模型切分点(模型切分比例)给AF,并进行联合推理交互。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还可以包括以下步骤:
步骤i1、向应用功能AF发送第一请求,所述第一请求用于请求与所述AF执行模型联合推理操作;其中,所述第一请求中携带的参数包括下述至少一项:AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤i2、接收AF发送的第一请求结果,所述第一请求结果是由AF根据 第一请求中携带的参数确定的,所述第一请求结果包括接受第一请求或不接受第一请求;
其中,若第一请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
具体地,实施过程为:
步骤1401,在场景12的实施例中的步骤401(即:
UE发起和AF进行ML模型联合推理请求Naf_MLModelJointInference_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供。请求中还增加模型切分点(模型切分比例)信息上报给AF)。
步骤1402,如场景22的实施例中的步骤806-808所示,UE和AF建立联合模型推理。(即:
AF和相关UE(s)建立连接。
AF向相关UE(s)发送模型联合推理请求Naf_MLModelJointInference_Request,请求中包含模型切分点(或者模型切分比例)信息。
UE(s)向AF发送模型联合推理请求的响应Naf_MLModelJointInference_Request response,表示是否接受这个模型联合推理请求。)
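上述联合推理请求/响应的交互,可用如下Python示意(接受条件为本示意的假设:AF剩余可推理的层数足够;字段名亦为示意,非标准定义):

```python
def handle_joint_inference_request(af_capacity_layers, request):
    """AF侧对模型联合推理请求的示意处理。
    request携带UE基于自身能力判断出的模型切分点,即UE负责前split_point层,
    AF负责其余各层;AF能力足够时接受该联合推理请求。"""
    remaining = request["total_layers"] - request["split_point"]  # AF需推理的层数
    accepted = remaining <= af_capacity_layers
    return {"accepted": accepted, "split_point": request["split_point"]}

# 对应场景51:UE上报切分点2(8层模型,UE做前2层),AF可承担6层
resp = handle_joint_inference_request(
    af_capacity_layers=6,
    request={"total_layers": 8, "split_point": 2, "analytics_id": "MLModelSplit"})
# AF需推理6层且能力为6层 → 接受该模型联合推理请求
```

响应中的accepted即对应Naf_MLModelJointInference_Request response中"是否接受这个模型联合推理请求"的指示。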
场景52、当UE和NWDAF联合推理模型时,UE基于自身能力进行模型切分点的判断,上报NWDAF,并进行联合推理交互。
可选地,所述确定AI/ML模型切分的切分结果之后,所述方法还可以包括以下步骤:
步骤j1、通过AF向网络数据分析功能NWDAF发送第二请求,所述第二请求用于请求与所述NWDAF执行模型联合推理操作;其中,所述第二请求中携带的参数包括下述至少一项:AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
步骤j2、接收NWDAF发送的第二请求结果,所述第二请求结果是由NWDAF根据第二请求中携带的参数确定的,所述第二请求结果包括接受第二请求或不接受第二请求;
其中,若第二请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
具体地,实施过程为:
步骤1501,在场景12的实施例中的步骤401(即:
UE发起和AF进行ML模型联合推理请求Naf_MLModelJointInference_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE自身的可提供的算力、可提供的内存、剩余的电量等信息;如果有推理AI/ML模型时需要数据集的隐私等级,则请求中还包含推理AI/ML模型需要的数据集的隐私等级,如果没有,基于SA3确认是否可以设定;如果有推理AI/ML模型时需要模型不同层的时延需求信息,则请求中还包含模型不同层的时延需求信息,如果没有,由AF提供。请求中还增加模型切分点(模型切分比例)信息上报给AF)。
步骤1502,AF通过场景13的实施例中的步骤503(即:
AF向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含和Model ID关联的Analytics ID=MLModelSplit,UE的可提供的算力、可提供的内存、剩余的电量,以及推理此模型需要的数据集的隐私等级,以及模型不同层的时延需求等信息,以及请求NWDAF帮忙发现其他可参与模型推理的UE(s)以及其相关的可提供的算力、可提供的内存、剩余的电量等(若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,还请求提供其 他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息)并请求NWDAF进行模型切分点的判断;
AF向NEF发送Nnef_MLModelSplit_Request,请求中包含上述信息;NEF授权后,向NWDAF发送Nnwdaf_MLModelSplit_Request,请求中包含上述信息)将信息发送给NWDAF。
步骤1503,NWDAF和UE建立联合模型推理。
本公开实施例,通过UE发送模型联合推理请求,以及请求中包含的参数;UE接收从NWDAF/AF获得模型联合推理响应,以及响应中包含的参数。因此,终端基于可提供的电量、可提供的内存、可提供的算力等信息,以及关于此模型的时延需求以及关于此模型需要的推理数据集的隐私等级等,选择出模型切分点,终端将模型切分点信息发送给参与模型联合推理的网络实体或者终端。基于终端能力实现模型切分推理,从而有益于保护终端隐私,优化网络资源。
图14为本公开实施例提供的辅助模型切分的装置的结构示意图,如图14所示,本实施例提供的辅助模型切分的装置应用于网络实体。则本实施例提供的辅助模型切分的装置包括:收发机1400,用于在处理器1410的控制下接收和发送数据。
其中,在图14中,总线架构可以包括任意数量的互联的总线和桥,具体由处理器1410代表的一个或多个处理器和存储器1420代表的存储器的各种电路链接在一起。总线架构还可以将诸如外围设备、稳压器和功率管理电路等之类的各种其他电路链接在一起,这些都是本领域所公知的,因此,本文不再对其进行进一步描述。总线接口提供接口。收发机1400可以是多个元件,即包括发送机和接收机,提供用于在传输介质上与各种其他装置通信的单元,这些传输介质包括无线信道、有线信道、光缆等传输介质。处理器1410负责管理总线架构和通常的处理,存储器1420可以存储处理器1410在执行操作时所使用的数据。
处理器1410可以是中央处理器(CPU)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或复杂可编程逻辑器件(Complex Programmable Logic Device,CPLD),处理器也可以采用多核架构。
本实施例中,存储器1420,用于存储计算机程序;收发机1400,用于在处理器1410的控制下收发数据;处理器1410,用于读取存储器中的计算机程序并执行以下操作:
接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
根据所述第一消息,确定AI/ML模型切分的切分结果。
可选地,所述第一消息中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者用户永久标识SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一消息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
可选地,处理器1410,用于根据所述第一消息,确定AI/ML模型切分的切分结果时,具体包括:
根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
将所述模型切分点信息作为所述切分结果。
可选地,处理器1410,用于确定所述待参与模型联合推理的UE(s)对应的模型切分点信息时,具体包括:
将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值,和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值,以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时,确定所述目标UE执行第二预设数目层的推理,第一预设数目层小于第二预设数目层;
若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
可选地,处理器1410,用于在所述网络实体是网络数据分析功能NWDAF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时, 具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第一请求,所述第一请求用于请求对AI/ML模型切分进行分析;其中,所述第一请求中携带的参数包括下述至少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第一请求中携带的参数,向5GC NF(s)发送第二消息,所述第二消息用于请求5GC NF(s)采集UE(s)对应的第一数据,所述第一数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)以及SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第一请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第一数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收5GC NF(s)发送的所述第一数据,并将所述第一数据作为所述第一消息;所述第一数据是由待参与模型联合推理的UE(s)同意所述第二消息的请求后提供的。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第三消息,所述第三消息中包括AI/ML模型切分点信息以及所述第一数据;所述第三消息用于为AF提供向待参与模型联合推理的UE(s)发送第二请求时携带的参数,所述第二请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第二请求中携带的参数包括所述第三消息。
可选地,处理器1410,用于在所述网络实体是网络数据分析功能NWDAF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收AF发送的第三请求;所述第三请求是由AF确定与待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第四请求中携带的参数确定的,所述第四请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第四请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,所述第三请求中携带的参数包括所述第四请求中携带的参数;将所述第三请求中携带的参数作为所述第一消息;
其中,若第四请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第四消息,所述第四消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第四消息用于为AF提供向待参与模型联合推理的UE(s)发送的第四请求结果,所述第四请求结果中包括所述AI/ML模型切分点信息。
可选地,处理器1410,用于在所述网络实体是网络数据分析功能NWDAF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第五请求,所述第五请求是由AF确定与待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第六请求确定的,所述第五请求用于请求对AI/ML模型切分进行分析以及请求查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量;其中,所述第五请求中携带的参数包括下述至 少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第六请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第六请求中携带的参数包括所述第五请求中携带的参数;
根据所述第五请求中携带的参数,向5GC NF(s)发送第五消息,所述第五消息用于请求5GC NF(s)采集UE(s)对应的第二数据,所述第二数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第五请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第二数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收5GC NF(s)发送的所述第二数据,并将所述第二数据作为所述第一消息;所述第二数据是由待参与模型联合推理的UE(s)同意所述第五消息的请求后提供的;
其中,若第六请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第六消息,所述第六消息中至少包括所述AI/ML模型切分点信息以及所述第二数据;所述第六消息用于为AF提供向待参与模型联合推理的UE(s)发送的第六请求结果,所述第六请求结果中包括所述待参与模型联合推理的UE(s)对应的所述AI/ML模型切分点信息、所述其他可参与模型推理的UE(s)以及对应的所述AI/ML模型切分点信息。
可选地,处理器1410,用于在所述网络实体是网络数据分析功能NWDAF 时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
接收待参与模型联合推理的UE(s)发送的第七请求,所述第七请求用于请求与NWDAF执行模型联合推理操作;所述第七请求中携带的参数包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第七请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第七请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则请求NF查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,且若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则请求提供其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NF发送的查找结果,所述查找结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则查找结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
将所述第七请求中携带的参数以及所述查找结果作为所述第一消息;
其中,若第七请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述待参与模型联合推理的UE(s)发送第七请求结果,所述第七请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第七请求结果中还 包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
可选地,处理器1410,用于在所述网络实体是应用功能AF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
接收待参与模型联合推理的UE(s)发送的第八请求,所述第八请求用于请求与AF执行模型联合推理操作;所述第八请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第八请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第八请求中携带的参数,确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第八请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向网络数据分析功能NWDAF发送第九请求,所述第九请求中携带的参数包括所述第八请求中携带的参数,且所述第九请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第九请求结果,并将所述第九请求结果以及所述第八请求中携带的参数作为所述第一消息;其中,所述第九请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
若第八请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,将确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力的结果、所述待参与模型联合推理的UE(s)的相关信息以及所述待参与模型联合推理的UE(s)对应的模型切分点信息发送给所述UE;
其中,所述相关信息包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述相关信息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
可选地,处理器1410,用于在所述网络实体是应用功能AF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
向网络数据分析功能NWDAF发送第十请求,所述第十请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的用于分析AI/ML模型切分的数据,所述第十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十请求结果,所述第十请求结果包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第十请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第十请求结果中还包括:推理 AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
将所述第十请求结果作为所述第一消息。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向待参与模型联合推理的UE(s)发送第十一请求,所述第十一请求用于请求与所述待参与模型联合推理的UE(s)执行模型联合推理操作;其中,所述第十一请求中携带的参数包括下述至少一项:模型切分点信息、所述第一消息;
接收待参与模型联合推理的UE(s)发送的第十一请求结果,所述第十一请求结果是由待参与模型联合推理的UE(s)根据第十一请求中携带的参数确定的,所述第十一请求结果包括接受第十一请求或不接受第十一请求。
可选地,处理器1410,用于在所述网络实体是新的网络实体MMF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第十二请求,所述第十二请求用于请求AI/ML模型切分分析;所述第十二请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十二请求中携带的参数,向NWDAF发送第十三请求,所述第十三请求中携带的参数包括所述第十二请求中携带的参数,且所述第十三请求用于请求NWDAF向5GC NF(s)采集UE(s)的用于分析AI/ML模型切分的数据;
接收NWDAF发送的第十三请求结果;其中,所述第十三请求结果包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第十三请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第十三请求结果中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
将所述第十三请求结果作为所述第一消息。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第八消息,所述第八消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第八消息用于为AF提供向待参与模型联合推理的UE(s)发送第十四请求时携带的参数,所述第十四请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十四请求中携带的参数包括所述第八消息。
可选地,处理器1410,用于在所述网络实体是新的网络实体MMF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第十五请求,所述第十五请求用于请求AI/ML模型切分分析,所述第十五请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十五请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第十五请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第十六请求,所述第十六请求中携带的参数包括所述第十五请求中携带的参数,且所述第十六请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提 供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十六请求结果,并将所述第十六请求结果以及所述第十五请求中携带的参数作为所述第一消息;其中,所述第十六请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第十五请求结果,所述第十五请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第十五请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述待参与模型联合推理的UE(s)。
可选地,处理器1410,用于在所述网络实体是策略控制功能PCF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第十七请求,所述第十七请求用于请求AI/ML模型切分分析;所述第十七请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十七请求中携带的参数,向网络数据分析功能NWDAF发送 第十八请求,所述第十八请求中携带的参数包括所述第十七请求中携带的参数,且所述第十八请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的第五数据;所述第五数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十八请求结果;其中,所述第十八请求结果包括所述第五数据;
将所述第十八请求结果作为所述第一消息。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第九消息,所述第九消息中包括所述AI/ML模型切分点、所述第一消息的数据;所述第九消息用于为AF提供向待参与模型联合推理的UE(s)发送第十九请求时携带的参数,所述第十九请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十九请求中携带的参数包括所述第九消息。
可选地,处理器1410,用于在所述网络实体是策略控制功能PCF时,且接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息时,具体包括:
直接或通过网络能力开放功能NEF接收应用功能AF发送的第二十请求,所述第二十请求用于请求AI/ML模型切分分析,所述第二十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第二十请求中携带的参数,确定所述AF以及待参与模型联合推理的UE是否具备完全支持执行模型联合推理的能力;
若所述AF以及待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第二十请求中携带的参数作为所述第一消息;
若所述AF以及待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第二十一请求,所述第二十一请求中携带的参数包括所述第二十请求中携带的参数,且所述第二十一请求用于请求NWDAF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第二十一请求结果,并将所述第二十一请求结果以及所述第二十请求中携带的参数作为所述第一消息;其中,所述第二十一请求结果包括:其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
若第二十请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1410,还用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第二十请求结果,所述第二十请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第二十请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
在此需要说明的是,本公开提供的辅助模型切分的装置,能够实现图2-图12所示方法实施例所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
图15为本公开另一实施例提供的辅助模型切分的装置的结构示意图,如图15所示,本实施例提供的辅助模型切分的装置应用于网络实体,则本实施 例提供的辅助模型切分的装置1500包括:
接收单元1501,用于接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
确定单元1502,用于根据所述第一消息,确定AI/ML模型切分的切分结果。
可选地,所述第一消息中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者用户永久标识SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一消息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
可选地,确定单元,具体用于:
根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
将所述模型切分点信息作为所述切分结果。
可选地,确定单元,具体用于:
将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值,和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值,以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时,确定所述目标UE执行第二预设数目层的推理,第一预设数目层小于第二预设数目层;
若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
可选地,接收单元,具体用于:
如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第一请求,所述第一请求用于请求对AI/ML模型切分进行分析;其中,所述第一请求中携带的参数包括下述至少 一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第一请求中携带的参数,向5GC NF(s)发送第二消息,所述第二消息用于请求5GC NF(s)采集UE(s)对应的第一数据,所述第一数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)以及SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第一请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第一数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收5GC NF(s)发送的所述第一数据,并将所述第一数据作为所述第一消息;所述第一数据是由待参与模型联合推理的UE(s)同意所述第二消息的请求后提供的。
可选地,辅助模型切分的装置,还包括:处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第三消息,所述第三消息中包括AI/ML模型切分点信息以及所述第一数据;所述第三消息用于为AF提供向待参与模型联合推理的UE(s)发送第二请求时携带的参数,所述第二请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第二请求中携带的参数包括所述第三消息。
可选地,所述接收单元,具体用于:
如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收AF发送的第三请求;所述第三请求是由AF确定与待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第四请求中携带的参数确定的,所述第四请求 用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第四请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,所述第三请求中携带的参数包括所述第四请求中携带的参数;将所述第三请求中携带的参数作为所述第一消息;
其中,若第四请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,辅助模型切分的装置,还包括:处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第四消息,所述第四消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第四消息用于为AF提供向待参与模型联合推理的UE(s)发送的第四请求结果,所述第四请求结果中包括所述AI/ML模型切分点信息。
可选地,接收单元,具体用于:
如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第五请求,所述第五请求是由AF确定与待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第六请求确定的,所述第五请求用于请求对AI/ML模型切分进行分析以及请求查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量;其中,所述第五请求中携带的参数包括下述至少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的 时延需求信息;所述第六请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第六请求中携带的参数包括所述第五请求中携带的参数;
根据所述第五请求中携带的参数,向5GC NF(s)发送第五消息,所述第五消息用于请求5GC NF(s)采集UE(s)对应的第二数据,所述第二数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第五请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第二数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收5GC NF(s)发送的所述第二数据,并将所述第二数据作为所述第一消息;所述第二数据是由待参与模型联合推理的UE(s)同意所述第五消息的请求后提供的;
其中,若第六请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第六消息,所述第六消息中至少包括所述AI/ML模型切分点信息以及所述第二数据;所述第六消息用于为AF提供向待参与模型联合推理的UE(s)发送的第六请求结果,所述第六请求结果中包括所述待参与模型联合推理的UE(s)对应的所述AI/ML模型切分点信息、所述其他可参与模型推理的UE(s)以及对应的所述AI/ML模型切分点信息。
可选地,接收单元,具体用于:
如果所述网络实体是网络数据分析功能NWDAF,接收待参与模型联合推理的UE(s)发送的第七请求,所述第七请求用于请求与NWDAF执行模型联合推理操作;所述第七请求中携带的参数包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第七请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第七请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则请求NF查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,且若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则请求提供其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NF发送的查找结果,所述查找结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则查找结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
将所述第七请求中携带的参数以及所述查找结果作为所述第一消息;
其中,若第七请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述待参与模型联合推理的UE(s)发送第七请求结果,所述第七请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第七请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
可选地,接收单元,具体用于:
如果所述网络实体是应用功能AF,接收待参与模型联合推理的UE(s)发送的第八请求,所述第八请求用于请求与AF执行模型联合推理操作;所述第八请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型 联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第八请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第八请求中携带的参数,确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第八请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向网络数据分析功能NWDAF发送第九请求,所述第九请求中携带的参数包括所述第八请求中携带的参数,且所述第九请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第九请求结果,并将所述第九请求结果以及所述第八请求中携带的参数作为所述第一消息;其中,所述第九请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
若第八请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,将确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力的结果、所述待参与模型联合推理的UE(s)的相关信息以及所述待参与模型联合推理的UE(s)对应的模型切分点信息发送给所述UE;
其中,所述相关信息包括下述至少一项:待参与模型联合推理的UE(s) 可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述相关信息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
可选地,接收单元,具体用于:
如果所述网络实体是应用功能AF,向网络数据分析功能NWDAF发送第十请求,所述第十请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的用于分析AI/ML模型切分的数据,所述第十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十请求结果,所述第十请求结果包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第十请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第十请求结果中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
将所述第十请求结果作为所述第一消息。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向待参与模型联合推理的UE(s)发送第十一请求,所述第十一请求用于请求与所述待参与模型联合推理的UE(s)执行模型联合推理操作;其中,所述第十一请求中携带的参数包括下述至少一项:模型切分点信息、所述第一消息;
接收待参与模型联合推理的UE(s)发送的第十一请求结果,所述第十一请求结果是由待参与模型联合推理的UE(s)根据第十一请求中携带的参数确定 的,所述第十一请求结果包括接受第十一请求或不接受第十一请求。
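待参与模型联合推理的UE(s)如何根据第十一请求中携带的参数（如切分点分配的层数）确定接受或不接受请求，可参考如下示意性判定逻辑。其中每层算力/内存开销系数与电量下限均为本文假设的示例值，并非规范规定；实际判定准则由UE实现决定。

```python
def evaluate_joint_inference_request(assigned_layers: int,
                                     ue_compute: float,
                                     ue_memory: float,
                                     ue_battery: float,
                                     per_layer_compute: float = 0.5,
                                     per_layer_memory: float = 64.0,
                                     min_battery: float = 20.0) -> str:
    """示意：UE根据请求中携带的切分点（分配给自己的层数）与自身可提供的
    算力、内存、剩余电量，判断是否接受模型联合推理请求。
    各开销系数与阈值均为假设值。"""
    enough = (ue_compute >= assigned_layers * per_layer_compute
              and ue_memory >= assigned_layers * per_layer_memory
              and ue_battery >= min_battery)
    return "accept" if enough else "reject"
```

例如，资源充足的UE对分配4层的请求返回接受，而算力或内存不足以承担分配层数时返回不接受。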
可选地,接收单元,具体用于:
如果所述网络实体是新的网络实体MMF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第十二请求,所述第十二请求用于请求AI/ML模型切分分析;所述第十二请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十二请求中携带的参数,向NWDAF发送第十三请求,所述第十三请求中携带的参数包括所述第十二请求中携带的参数,且所述第十三请求用于请求NWDAF向5GC NF(s)采集UE(s)的用于分析AI/ML模型切分的数据;
接收NWDAF发送的第十三请求结果；其中，所述第十三请求结果包括下述至少一项：待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小；若第十三请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息，则所述第十三请求结果中还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
将所述第十三请求结果作为所述第一消息。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第八消息,所述第八消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第八消息用于为AF提供向待参与模型联合推理的UE(s)发送第十四请求时携带的参数,所述第十四请求用于请求与待参与模型联合推理的UE(s)执行模型联合 推理操作,所述第十四请求中携带的参数包括所述第八消息。
可选地,接收单元,具体用于:
如果所述网络实体是新的网络实体MMF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第十五请求,所述第十五请求用于请求AI/ML模型切分分析,所述第十五请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十五请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第十五请求中携带的参数作为所述第一消息;
若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第十六请求,所述第十六请求中携带的参数包括所述第十五请求中携带的参数,且所述第十六请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十六请求结果，并将所述第十六请求结果以及所述第十五请求中携带的参数作为所述第一消息；其中，所述第十六请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量，若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第十六请求结果还包括：其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第十五请求结果,所述第十五请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第十五请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述待参与模型联合推理的UE(s)。
可选地,接收单元,具体用于:
如果所述网络实体是策略控制功能PCF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第十七请求,所述第十七请求用于请求AI/ML模型切分分析;所述第十七请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第十七请求中携带的参数,向网络数据分析功能NWDAF发送第十八请求,所述第十八请求中携带的参数包括所述第十七请求中携带的参数,且所述第十八请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的第五数据;所述第五数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第十八请求结果;其中,所述第十八请求结果包括所述第五数据;
将所述第十八请求结果作为所述第一消息。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第九消息,所述 第九消息中包括所述AI/ML模型切分点、所述第一消息的数据;所述第九消息用于为AF提供向待参与模型联合推理的UE(s)发送第十九请求时携带的参数,所述第十九请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十九请求中携带的参数包括所述第九消息。
可选地,接收单元,具体用于:如果所述网络实体是策略控制功能PCF,
直接或通过网络能力开放功能NEF接收应用功能AF发送的第二十请求,所述第二十请求用于请求AI/ML模型切分分析,所述第二十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
根据所述第二十请求中携带的参数，确定所述AF以及待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力；
若所述AF以及待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第二十请求中携带的参数作为所述第一消息;
若所述AF以及待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第二十一请求,所述第二十一请求中携带的参数包括所述第二十请求中携带的参数,且所述第二十一请求用于请求NWDAF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
接收NWDAF发送的第二十一请求结果，并将所述第二十一请求结果以及所述第二十请求中携带的参数作为所述第一消息；其中，所述第二十一请求结果包括：其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量，若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第二十一请求结果还包括：其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
若第二十请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,辅助模型切分的装置,还包括处理单元;处理单元,用于:
确定AI/ML模型切分的切分结果之后,向所述AF发送第二十请求结果,所述第二十请求结果中包括AI/ML模型切分点信息;
其中,若存在其他参与联合模型推理的UE(s),则所述第二十请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE。
在此需要说明的是,本公开提供的辅助模型切分的装置,能够实现图2-图12方法实施例所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
图16为本公开再一实施例提供的辅助模型切分的装置的结构示意图,如图16所示,本实施例提供的辅助模型切分的装置应用于用户设备UE。则本实施例提供的辅助模型切分的装置包括:收发机1600,用于在处理器1610的控制下接收和发送数据。
其中,在图16中,总线架构可以包括任意数量的互联的总线和桥,具体由处理器1610代表的一个或多个处理器和存储器1620代表的存储器的各种电路链接在一起。总线架构还可以将诸如外围设备、稳压器和功率管理电路等之类的各种其他电路链接在一起,这些都是本领域所公知的,因此,本文不再对其进行进一步描述。总线接口提供接口。收发机1600可以是多个元件,即包括发送机和接收机,提供用于在传输介质上与各种其他装置通信的单元,这些传输介质包括无线信道、有线信道、光缆等传输介质。处理器1610负责管理总线架构和通常的处理,存储器1620可以存储处理器1610在执行操作时所使用的数据。
处理器1610可以是中央处理器(CPU)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现场可编程门阵列(Field-Programmable Gate Array，FPGA)或复杂可编程逻辑器件(Complex Programmable Logic Device，CPLD)，处理器也可以采用多核架构。
本实施例中,存储器1620,用于存储计算机程序;收发机1600,用于在处理器的控制下收发数据;处理器1610,用于读取存储器中的计算机程序并执行以下操作:
根据自身能力信息,确定AI/ML模型切分点信息;
将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
可选地,处理器1610,用于根据自身能力信息,确定AI/ML模型切分点信息时,具体包括:
根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述自身能力信息中的可提供的算力、可提供的内存、剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定所述模型切分点信息。
可选地,处理器1610,用于确定模型切分点信息时,具体包括:
将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
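上述按阈值分档确定各UE执行的预设数目层的规则，可以用如下示意性代码表示：任一资源低于第一档阈值即执行第一预设数目层，各资源都落入更高档位时执行对应更多的层数，以此类推。其中阈值数组与各档层数均为便于理解的假设值，并非规范取值。

```python
from bisect import bisect_right

def assigned_layers(value: float, thresholds: list, layer_counts: list) -> int:
    """按升序阈值表将某一资源量映射到预设数目层：
    低于第一阈值 -> 第一预设数目层，介于第N与第N+1阈值之间 -> 第N+1预设数目层。"""
    return layer_counts[bisect_right(thresholds, value)]

def layers_for_ue(compute: float, memory: float, battery: float,
                  thr: dict, counts: list) -> int:
    """目标UE最终执行的层数由其最弱的资源决定（取各资源档位的最小值），
    与上文“任一资源低于第一预设阈值则执行第一预设数目层”的规则一致。"""
    return min(assigned_layers(compute, thr["compute"], counts),
               assigned_layers(memory, thr["memory"], counts),
               assigned_layers(battery, thr["battery"], counts))

# 使用示例（阈值与层数均为假设值）：
thr = {"compute": [1.0, 2.0, 4.0],     # GFLOPS 阈值
       "memory": [256, 512, 1024],     # MB 阈值
       "battery": [20, 50, 80]}        # 电量百分比阈值
counts = [2, 4, 8, 16]                 # 第一至第四预设数目层，逐档递增
```

据此得到的各UE层数即可换算为AI/ML模型切分比例，作为模型切分点信息。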
可选地,处理器1610,还用于:
确定AI/ML模型切分的切分结果之后，向应用功能AF发送第一请求，所述第一请求用于请求与所述AF执行模型联合推理操作；其中，所述第一请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第一请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
接收AF发送的第一请求结果,所述第一请求结果是由AF根据第一请求中携带的参数确定的,所述第一请求结果包括接受第一请求或不接受第一请 求;
其中,若第一请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理器1610,还用于:
确定AI/ML模型切分的切分结果之后，通过AF向网络数据分析功能NWDAF发送第二请求，所述第二请求用于请求与所述NWDAF执行模型联合推理操作；其中，所述第二请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第二请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
接收NWDAF发送的第二请求结果，所述第二请求结果是由NWDAF根据第二请求中携带的参数确定的，所述第二请求结果包括接受第二请求或不接受第二请求；
其中,若第二请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
在此需要说明的是,本公开提供的辅助模型切分的装置,能够实现图13所示方法实施例所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
图17为本公开又一实施例提供的辅助模型切分的装置的结构示意图,如图17所示,本公开实施例提供的辅助模型切分的装置应用于用户设备UE,则本实施例提供的辅助模型切分的装置1700包括:
确定单元1701,用于根据自身能力信息,确定AI/ML模型切分点信息;
处理单元1702,用于将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
可选地,确定单元1701,具体用于:
根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量, 确定模型切分点信息;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述自身能力信息中的可提供的算力、可提供的内存、剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定模型切分点信息。
可选地,确定单元1701,具体用于:
将所有确定参与执行模型联合推理的UE中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
可选地,处理单元,还用于:
确定AI/ML模型切分的切分结果之后，向应用功能AF发送第一请求，所述第一请求用于请求与所述AF执行模型联合推理操作；其中，所述第一请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第一请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
接收AF发送的第一请求结果,所述第一请求结果是由AF根据第一请求中携带的参数确定的,所述第一请求结果包括接受第一请求或不接受第一请求;
其中,若第一请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
可选地,处理单元,还用于:
确定AI/ML模型切分的切分结果之后，通过AF向网络数据分析功能NWDAF发送第二请求，所述第二请求用于请求与所述NWDAF执行模型联合推理操作；其中，所述第二请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第二请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
接收NWDAF发送的第二请求结果，所述第二请求结果是由NWDAF根据第二请求中携带的参数确定的，所述第二请求结果包括接受第二请求或不接受第二请求；
其中,若第二请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
在此需要说明的是,本公开提供的辅助模型切分的装置,能够实现图13方法实施例所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
需要说明的是,本公开实施例对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(processor)执行本公开各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
本公开实施例还提供一种处理器可读存储介质。处理器可读存储介质存储有计算机程序,计算机程序用于使处理器执行上述任一种方法实施例。
其中,处理器可读存储介质可以是处理器能够存取的任何可用介质或数据存储设备,包括但不限于磁性存储器(例如软盘、硬盘、磁带、磁光盘(MO)等)、光学存储器(例如CD、DVD、BD、HVD等)、以及半导体存储器(例如ROM、EPROM、EEPROM、非易失性存储器(NAND FLASH)、固态硬盘(SSD))等。
本领域内的技术人员应明白,本公开的实施例可提供为方法、系统、或计算机程序产品。因此,本公开可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本公开可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。
本公开是参照根据本公开实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机可执行指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机可执行指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些处理器可执行指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的处理器可读存储器中,使得存储在该处理器可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些处理器可执行指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
显然,本领域的技术人员可以对本公开进行各种改动和变型而不脱离本公开的精神和范围。这样,倘若本公开的这些修改和变型属于本公开权利要求及其等同技术的范围之内,则本公开也意图包含这些改动和变型在内。

Claims (46)

  1. 一种辅助模型切分的方法,其特征在于,应用于网络实体,所述方法包括:
    接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
    根据所述第一消息,确定AI/ML模型切分的切分结果。
  2. 根据权利要求1所述的方法,其特征在于,所述第一消息中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者用户永久标识SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一消息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述第一消息,确定AI/ML模型切分的切分结果,包括:
    根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述第一消息中的待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定所述待参与模型联合推理的UE(s)对应的模型切分点信息;
    将所述模型切分点信息作为所述切分结果。
  4. 根据权利要求3所述的方法,其特征在于,所述确定所述待参与模型联合推理的UE(s)对应的模型切分点信息,包括:
    将所有确定参与执行模型联合推理的UE(s)中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
    若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电 量阈值时,确定所述目标UE执行第一预设数目层的推理;
    若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
    若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
    根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
  5. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实 体是网络数据分析功能NWDAF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第一请求,所述第一请求用于请求对AI/ML模型切分进行分析;其中,所述第一请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第一请求中携带的参数,向5GC NF(s)发送第二消息,所述第二消息用于请求5GC NF(s)采集UE(s)对应的第一数据,所述第一数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)以及SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第一请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第一数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收5GC NF(s)发送的所述第一数据,并将所述第一数据作为所述第一消息;所述第一数据是由待参与模型联合推理的UE(s)同意所述第二消息的请求后提供的。
  6. 根据权利要求5所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第三消息,所述第三消息中包括AI/ML模型切分点信息以及所述第一数据;所述第三消息用于为AF提供向待参与模型联合推理的UE(s)发送第二请求时携带的参数,所述第二请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第二请求中携带的参数包括所述第三消息。
  7. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实 体是网络数据分析功能NWDAF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收AF发送的第三请求;所述第三请求是由AF确定与待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第四请求中携带的参数确定的,所述第四请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第四请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,所述第三请求中携带的参数包括所述第四请求中携带的参数;
    将所述第三请求中携带的参数作为所述第一消息;
    其中,若第四请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  8. 根据权利要求7所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第四消息,所述第四消息中包括所述AI/ML模型切分点信息、所述第一消息中的数据;所述第四消息用于为AF提供向待参与模型联合推理的UE(s)发送的第四请求结果,所述第四请求结果中包括所述AI/ML模型切分点信息。
  9. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是网络数据分析功能NWDAF,所述接收用于请求人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第五请求,所述第五请求是由AF确定与待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第六请求 确定的,所述第五请求用于请求对AI/ML模型切分进行分析以及请求查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量;其中,所述第五请求中携带的参数包括下述至少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第六请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第六请求中携带的参数包括所述第五请求中携带的参数;
    根据所述第五请求中携带的参数,向5GC NF(s)发送第五消息,所述第五消息用于请求5GC NF(s)采集UE(s)对应的第二数据,所述第二数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第五请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第二数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收5GC NF(s)发送的所述第二数据,并将所述第二数据作为所述第一消息;所述第二数据是由待参与模型联合推理的UE(s)同意所述第五消息的请求后提供的;
    其中,若第六请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  10. 根据权利要求9所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第六消息,所述第六消息中至少包括所述AI/ML模型切分点信息以及所述第二数据;所述第六消息用于为AF提供向待参与模型联合推理的UE(s)发送的第六请求结果,所述第六请求结果中包括所述待参与模 型联合推理的UE(s)以及对应的所述AI/ML模型切分点信息。
  11. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是网络数据分析功能NWDAF,所述接收用于请求人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    接收待参与模型联合推理的UE(s)发送的第七请求,所述第七请求用于请求与NWDAF执行模型联合推理操作;所述第七请求中携带的参数包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第七请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
    若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第七请求中携带的参数作为所述第一消息;
    若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则请求NF查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,且若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则请求提供其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NF发送的查找结果,所述查找结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则查找结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    将所述第七请求中携带的参数以及所述查找结果作为所述第一消息;
    其中,若第七请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  12. 根据权利要求11所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述待参与模型联合推理的UE(s)发送第七请求结果,所述第七请求结果中包括AI/ML模型切分点信息;
    其中,若存在其他参与联合模型推理的UE(s),则所述第七请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE(s)。
  13. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是应用功能AF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    接收待参与模型联合推理的UE(s)发送的第八请求,所述第八请求用于请求与AF执行模型联合推理操作;所述第八请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第八请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第八请求中携带的参数,确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
    若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第八请求中携带的参数作为所述第一消息;
    若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向网络数据分析功能NWDAF发送第九请求,所述第九请求中携带的参数包括所述第八请求中携带的参数,且所述第九请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NWDAF发送的第九请求结果,并将所述第九请求结果以及所述第八请求中携带的参数作为所述第一消息;其中,所述第九请求结果包括其他 可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第九请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    若第八请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  14. 根据权利要求13所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    将确定与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力的结果、所述待参与模型联合推理的UE(s)的相关信息以及所述待参与模型联合推理的UE(s)对应的模型切分点信息发送给所述UE;
    其中,所述相关信息包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述相关信息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
  15. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是应用功能AF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    向网络数据分析功能NWDAF发送第十请求,所述第十请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的用于分析AI/ML模型切分的数据,所述第十请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NWDAF发送的第十请求结果,所述第十请求结果包括下述至少一 项:待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第十请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第十请求结果中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    将所述第十请求结果作为所述第一消息。
  16. 根据权利要求15所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向待参与模型联合推理的UE(s)发送第十一请求,所述第十一请求用于请求与所述待参与模型联合推理的UE(s)执行模型联合推理操作;其中,所述第十一请求中携带的参数包括下述至少一项:模型切分点信息、所述第一消息;
    接收待参与模型联合推理的UE(s)发送的第十一请求结果,所述第十一请求结果是由待参与模型联合推理的UE(s)根据第十一请求中携带的参数确定的,所述第十一请求结果包括接受第十一请求或不接受第十一请求。
  17. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是新的网络实体MMF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第十二请求,所述第十二请求用于请求AI/ML模型切分分析;所述第十二请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十二请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第十二请求中携带的参数,向NWDAF发送第十三请求,所述第十三请求中携带的参数包括所述第十二请求中携带的参数,且所述第十三请求用于请求NWDAF向5GC NF(s)采集UE(s)的用于分析AI/ML模型切分 的数据;
    接收NWDAF发送的第十三请求结果；其中，所述第十三请求结果包括下述至少一项：待参与AI/ML模型切分的UE(s)或者SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小；若第十三请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息，则所述第十三请求结果中还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
    将所述第十三请求结果作为所述第一消息。
  18. 根据权利要求17所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第八消息,所述第八消息中包括所述AI/ML模型切分点、所述第一消息中的数据;所述第八消息用于为AF提供向待参与模型联合推理的UE(s)发送第十四请求时携带的参数,所述第十四请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十四请求中携带的参数包括所述第八消息。
  19. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是新的网络实体MMF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第十五请求,所述第十五请求用于请求AI/ML模型切分分析,所述第十五请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第十五请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
    若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第十五请求中携带的参数作为所述第一消息;
    若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第十六请求,所述第十六请求中携带的参数包括所述第十五请求中携带的参数,且所述第十六请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NWDAF发送的第十六请求结果,并将所述第十六请求结果以及所述第十五请求中携带的参数作为所述第一消息;其中,所述第十六请求结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十六请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
  20. 根据权利要求19所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第十五请求结果,所述第十五请求结果中包括AI/ML模型切分点信息;
    其中,若存在其他参与联合模型推理的UE(s),则所述第十五请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述待参与模型联合推理的UE(s)。
  21. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是策略控制功能PCF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第十七请求,所述第十七请求用于请求AI/ML模型切分分析;所述第十七请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接 收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第十七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第十七请求中携带的参数,向网络数据分析功能NWDAF发送第十八请求,所述第十八请求中携带的参数包括所述第十七请求中携带的参数,且所述第十八请求用于请求NWDAF向5GC NF(s)采集UE(s)对应的第五数据;所述第五数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NWDAF发送的第十八请求结果;其中,所述第十八请求结果包括所述第五数据;
    将所述第十八请求结果作为所述第一消息。
  22. 根据权利要求21所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第九消息,所述第九消息中包括所述AI/ML模型切分点、所述第一消息的数据;所述第九消息用于为AF提供向待参与模型联合推理的UE(s)发送第十九请求时携带的参数,所述第十九请求用于请求与待参与模型联合推理的UE(s)执行模型联合推理操作,所述第十九请求中携带的参数包括所述第九消息。
  23. 根据权利要求1-4任一项所述的方法,其特征在于,如果所述网络实体是策略控制功能PCF,所述接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息,包括:
    直接或通过网络能力开放功能NEF接收应用功能AF发送的第二十请求,所述第二十请求用于请求AI/ML模型切分分析,所述第二十请求中携带的参 数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第二十请求中携带的参数,确定所述AF以及待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
    若所述AF以及待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第二十请求中携带的参数作为所述第一消息;
    若所述AF以及待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则向NWDAF发送第二十一请求,所述第二十一请求中携带的参数包括所述第二十请求中携带的参数,且所述第二十一请求用于请求NWDAF通过NF查找其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求中携带的参数还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NWDAF发送的第二十一请求结果,并将所述第二十一请求结果以及所述第二十请求中携带的参数作为所述第一消息;其中,所述第二十一请求结果包括:其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第二十一请求结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    若第二十请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  24. 根据权利要求23所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向所述AF发送第二十请求结果,所述第二十请求结果中包括AI/ML模 型切分点信息;
    其中,若存在其他参与联合模型推理的UE(s),则所述第二十请求结果中还包括其他参与联合模型推理的UE(s)对应的模型切分点信息,并通过AF透传给所述UE(s)。
  25. 一种辅助模型切分的方法,其特征在于,应用于用户设备UE,所述方法包括:
    根据自身能力信息,确定AI/ML模型切分点信息;
    将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
  26. 根据权利要求25所述的方法,其特征在于,所述根据自身能力信息,确定AI/ML模型切分点信息,包括:
    根据自身能力信息中的可提供的算力、可提供的内存以及剩余的电量,确定模型切分点信息;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则根据所述自身能力信息中的可提供的算力、可提供的内存、剩余的电量、推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,确定模型切分点信息。
  27. 根据权利要求26所述的方法,其特征在于,所述确定模型切分点信息,包括:
    将所有确定参与执行模型联合推理的UE(s)中的任一UE作为目标UE,针对每个所述目标UE执行下述步骤:
    若所述目标UE可提供的算力低于第一预设算力阈值或所述目标UE可提供的内存低于第一预设内存阈值或所述目标UE剩余的电量低于第一预设电量阈值时,确定所述目标UE执行第一预设数目层的推理;
    若所述目标UE可提供的算力高于第一预设算力阈值并低于第二预设算力阈值，和所述目标UE可提供的内存高于第一预设内存阈值并低于第二预设内存阈值，以及所述目标UE剩余的电量高于第一预设电量阈值并低于第二预设电量阈值时，确定所述目标UE执行第二预设数目层的推理，第一预设数目层小于第二预设数目层；
    若所述目标UE可提供的算力高于第二预设算力阈值并低于第三预设算力阈值,和所述目标UE可提供的内存高于第二预设内存阈值并低于第三预 设内存阈值,以及所述目标UE剩余的电量高于第二预设电量阈值并低于第三预设电量阈值时,确定所述目标UE执行第三预设数目层的推理,第二预设数目层小于第三预设数目层;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级低于第一预设隐私等级或模型不同层的时延需求信息低于第一预设时延,则所述目标UE执行所述第一预设数目层的推理;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第一预设隐私等级并低于第二预设隐私等级,和模型不同层的时延需求信息高于第一预设时延并低于第二预设时延时,确定所述目标UE执行第二预设数目层的推理;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,且若推理AI/ML模型需要的数据集的隐私等级高于第二预设隐私等级并低于第三预设隐私等级,和模型不同层的时延需求信息高于第二预设时延并低于第三预设时延时,确定所述目标UE执行第三预设数目层的推理;以此类推,直至确定所述目标UE执行第N+1预设数目层的推理,第N预设数目层小于第N+1预设数目层;
    根据预设数目层确定AI/ML模型切分点信息,所述AI/ML模型切分点信息用于表示AI/ML模型切分比例,所述预设数目层包括第N预设数目层,N大于或等于一。
  28. 根据权利要求27所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    向应用功能AF发送第一请求，所述第一请求用于请求与所述AF执行模型联合推理操作；其中，所述第一请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第一请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
    接收AF发送的第一请求结果,所述第一请求结果是由AF根据第一请求中携带的参数确定的,所述第一请求结果包括接受第一请求或不接受第一请求;
    其中,若第一请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  29. 根据权利要求27所述的方法,其特征在于,所述确定AI/ML模型切分的切分结果之后,所述方法还包括:
    通过AF向网络数据分析功能NWDAF发送第二请求，所述第二请求用于请求与所述NWDAF执行模型联合推理操作；其中，所述第二请求中携带的参数包括下述至少一项：AI/ML模型切分点、与模型标识或模型切分标识关联的分析类型标识、接收AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足分析条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE可提供的算力、待参与模型联合推理的UE可提供的内存、待参与模型联合推理的UE剩余的电量；若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延，所述第二请求中携带的参数还包括：推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息；
    接收NWDAF发送的第二请求结果，所述第二请求结果是由NWDAF根据第二请求中携带的参数确定的，所述第二请求结果包括接受第二请求或不接受第二请求；
    其中,若第二请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  30. 一种辅助模型切分的装置,其特征在于,应用于网络实体,所述装置包括存储器,收发机,处理器:
    存储器,用于存储计算机程序;收发机,用于在所述处理器的控制下收发数据;处理器,用于读取所述存储器中的计算机程序并执行以下操作:
    接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
    根据所述第一消息,确定AI/ML模型切分的切分结果。
  31. 一种辅助模型切分的装置,其特征在于,应用于用户设备UE,所述装置包括存储器,收发机,处理器:
    存储器,用于存储计算机程序;收发机,用于在所述处理器的控制下收发数据;处理器,用于读取所述存储器中的计算机程序并执行以下操作:
    根据自身能力信息,确定AI/ML模型切分点信息;
    将所述AI/ML模型切分点信息作为AI/ML模型切分的切分结果。
  32. 一种辅助模型切分的装置,其特征在于,应用于网络实体,所述装置包括:
    接收单元,用于接收用于辅助人工智能/机器学习AI/ML模型切分分析的第一消息;
    确定单元,用于根据所述第一消息,确定AI/ML模型切分的切分结果。
  33. 根据权利要求32所述的装置,其特征在于,所述第一消息中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者用户永久标识SUPI、使用AI/ML模型的应用的标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;
    若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第一消息中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息。
  34. 根据权利要求32或33所述的装置,其特征在于,接收单元,具体用于:
    如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第一请求,所述第一请求用于请求对AI/ML模型切分进行分析;其中,所述第一请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型需要数据集的隐私等级和/或模型不同层的时延,所述第一请求中携带的参数还包括:推理AI/ML模型需要的数据 集的隐私等级和/或模型不同层的时延需求信息;
    根据所述第一请求中携带的参数,向5GC NF(s)发送第二消息,所述第二消息用于请求5GC NF(s)采集UE(s)对应的第一数据,所述第一数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)以及SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第一请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第一数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收5GC NF(s)发送的所述第一数据,并将所述第一数据作为所述第一消息;所述第一数据是由待参与模型联合推理的UE(s)同意所述第二消息的请求后提供的。
  35. 根据权利要求32或33所述的装置,其特征在于,接收单元,具体用于:
    如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收AF发送的第三请求;所述第三请求是由AF确定与待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第四请求中携带的参数确定的,所述第四请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第四请求中携带的参数包括下述至少一项:与模型标识或模型切分标识关联的分析类型标识、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第四请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,所述第三请求中携带的参数包括所述第四请求中携带的参数;
    将所述第三请求中携带的参数作为所述第一消息;
    其中,若第四请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  36. 根据权利要求32或33所述的装置,其特征在于,发送单元,具体用于:
    如果所述网络实体是网络数据分析功能NWDAF,直接或通过网络能力开放功能NEF接收应用功能AF发送的第五请求,所述第五请求是由AF确定与待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力后确定的,其中,是否具备完全支持执行模型联合推理的能力是由AF根据接收到的待参与模型联合推理的UE(s)发送的第六请求确定的,所述第五请求用于请求对AI/ML模型切分进行分析以及请求查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量;其中,所述第五请求中携带的参数包括下述至少一项:与模型切分关联的分析类型标识、接受AI/ML模型切分的一个用户设备UE或一组UEs的标识或满足切分条件的任意UEs、AI/ML模型切分的区域、AI/ML模型的大小、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第五请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;所述第六请求用于待参与模型联合推理的UE(s)请求与AF进行模型联合推理,所述第六请求中携带的参数包括所述第五请求中携带的参数;
    根据所述第五请求中携带的参数,向5GC NF(s)发送第五消息,所述第五消息用于请求5GC NF(s)采集UE(s)对应的第二数据,所述第二数据中包括下述至少一项:待参与AI/ML模型切分的UE(s)或者SUPI、待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量、AI/ML模型的大小;若第五请求中携带的参数包括推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息,则所述第二数据中还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收5GC NF(s)发送的所述第二数据,并将所述第二数据作为所述第一消息;所述第二数据是由待参与模型联合推理的UE(s)同意所述第五消息的请求后提供的;
    其中,若第六请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  37. 根据权利要求32或33所述的装置,其特征在于,接收单元,具体用于:
    如果所述网络实体是网络数据分析功能NWDAF,接收待参与模型联合推理的UE(s)发送的第七请求,所述第七请求用于请求与NWDAF执行模型联合推理操作;所述第七请求中携带的参数包括下述至少一项:待参与模型联合推理的UE(s)可提供的算力、待参与模型联合推理的UE(s)可提供的内存、待参与模型联合推理的UE(s)剩余的电量;若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,所述第七请求中携带的参数还包括:推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;根据所述第七请求中携带的参数,确定AF与所述待参与模型联合推理的UE(s)是否具备完全支持执行模型联合推理的能力;
    若与所述待参与模型联合推理的UE(s)具备完全支持执行模型联合推理的能力,则将所述第七请求中携带的参数作为所述第一消息;
    若与所述待参与模型联合推理的UE(s)不具备完全支持执行模型联合推理的能力,则请求NF查找其他可参与模型推理的UE(s)并提供其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,且若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则请求提供其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    接收NF发送的查找结果,所述查找结果包括其他可参与模型推理的UE(s)以及其他可参与模型推理的UE(s)可提供的算力、可提供的内存、剩余的电量,若推理AI/ML模型时需要数据集的隐私等级和/或模型不同层的时延,则查找结果还包括:其他可参与模型推理的UE(s)推理AI/ML模型需要的数据集的隐私等级和/或模型不同层的时延需求信息;
    将所述第七请求中携带的参数以及所述查找结果作为所述第一消息;
    其中,若第七请求中携带的参数中不包含模型不同层的时延需求信息,则模型不同层的时延需求信息由AF提供。
  38. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is an Application Function (AF), receive an eighth request sent by the UE(s) to participate in joint model inference, the eighth request being used to request performing a joint model inference operation with the AF; the parameters carried in the eighth request include at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the area of the AI/ML model segmentation, the size of the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the eighth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model; and determine, according to the parameters carried in the eighth request, whether the AF together with the UE(s) to participate in joint model inference has the capability to fully support performing joint model inference;
    if the capability to fully support performing joint model inference together with the UE(s) to participate in joint model inference is available, use the parameters carried in the eighth request as the first message;
    if the capability to fully support performing joint model inference together with the UE(s) to participate in joint model inference is not available, send a ninth request to the Network Data Analytics Function (NWDAF), the parameters carried in the ninth request including the parameters carried in the eighth request, the ninth request being used to request the NWDAF to find, through the NF, other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the ninth request further include: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    receive the result of the ninth request sent by the NWDAF, and use the result of the ninth request together with the parameters carried in the eighth request as the first message; wherein the result of the ninth request includes the other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the result of the ninth request further includes: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    wherein, if the parameters carried in the eighth request do not include the latency requirement information of the different layers of the model, the latency requirement information of the different layers of the model is provided by the AF.
  39. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is an Application Function (AF), send a tenth request to the Network Data Analytics Function (NWDAF), the tenth request being used to request the NWDAF to collect from the 5GC NF(s) the data corresponding to the UE(s) for analysing the AI/ML model segmentation; the parameters carried in the tenth request include at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving the AI/ML model segmentation or any UEs satisfying the analysis condition, the area of the AI/ML model segmentation, the size of the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the tenth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    receive the result of the tenth request sent by the NWDAF, the result of the tenth request including at least one of the following: the UE(s) to participate in AI/ML model segmentation or their SUPI(s), the identifier of the application using the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, the remaining battery of the UE(s) to participate in joint model inference, and the size of the AI/ML model; if the parameters carried in the tenth request include the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model, the result of the tenth request further includes: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    use the result of the tenth request as the first message.
  40. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is a new network entity MMF, receive, directly or through a Network Exposure Function (NEF), a twelfth request sent by an Application Function (AF), the twelfth request being used to request AI/ML model segmentation analysis; the parameters carried in the twelfth request include at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving the AI/ML model segmentation or any UEs satisfying the analysis condition, the area of the AI/ML model segmentation, the size of the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the twelfth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    according to the parameters carried in the twelfth request, send a thirteenth request to the NWDAF, the parameters carried in the thirteenth request including the parameters carried in the twelfth request, the thirteenth request being used to request the NWDAF to collect from the 5GC NF(s) the UE(s)' data for analysing the AI/ML model segmentation;
    receive the result of the thirteenth request sent by the NWDAF, the result of the thirteenth request including at least one of the following: the UE(s) to participate in AI/ML model segmentation or their SUPI(s), the identifier of the application using the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, the remaining battery of the UE(s) to participate in joint model inference, and the size of the AI/ML model; if the parameters carried in the thirteenth request include the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model, the result of the thirteenth request further includes: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    use the result of the thirteenth request as the first message.
  41. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is a new network entity MMF, receive, directly or through a Network Exposure Function (NEF), a fifteenth request sent by an Application Function (AF), the fifteenth request being used to request AI/ML model segmentation analysis, the parameters carried in the fifteenth request including at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the fifteenth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    determine, according to the parameters carried in the fifteenth request, whether the AF together with the UE(s) to participate in joint model inference has the capability to fully support performing joint model inference;
    if the capability to fully support performing joint model inference together with the UE(s) to participate in joint model inference is available, use the parameters carried in the fifteenth request as the first message;
    if the capability to fully support performing joint model inference together with the UE(s) to participate in joint model inference is not available, send a sixteenth request to the NWDAF, the parameters carried in the sixteenth request including the parameters carried in the fifteenth request, the sixteenth request being used to request the NWDAF to find, through the NF, other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the sixteenth request further include: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    receive the result of the sixteenth request sent by the NWDAF, and use the result of the sixteenth request together with the parameters carried in the fifteenth request as the first message; wherein the result of the sixteenth request includes the other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the result of the sixteenth request further includes: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model.
  42. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is a Policy Control Function (PCF), receive, directly or through a Network Exposure Function (NEF), a seventeenth request sent by an Application Function (AF), the seventeenth request being used to request AI/ML model segmentation analysis; the parameters carried in the seventeenth request include at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the identifier of one user equipment (UE) or a group of UEs receiving the AI/ML model segmentation or any UEs satisfying the analysis condition, the area of the AI/ML model segmentation, the size of the AI/ML model, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the seventeenth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    according to the parameters carried in the seventeenth request, send an eighteenth request to the Network Data Analytics Function (NWDAF), the parameters carried in the eighteenth request including the parameters carried in the seventeenth request, the eighteenth request being used to request the NWDAF to collect from the 5GC NF(s) fifth data corresponding to the UE(s); the fifth data includes at least one of the following: the UE(s) to participate in AI/ML model segmentation or their SUPI(s), the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, the remaining battery of the UE(s) to participate in joint model inference, and the size of the AI/ML model; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the fifth data further includes: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    receive the result of the eighteenth request sent by the NWDAF, the result of the eighteenth request including the fifth data;
    use the result of the eighteenth request as the first message.
  43. The apparatus according to claim 32 or 33, wherein the receiving unit is specifically configured to:
    if the network entity is a Policy Control Function (PCF), receive, directly or through a Network Exposure Function (NEF), a twentieth request sent by an Application Function (AF), the twentieth request being used to request AI/ML model segmentation analysis, the parameters carried in the twentieth request including at least one of the following: an analysis type identifier associated with the model identifier or model segmentation identifier, the computing power that the UE(s) to participate in joint model inference can provide, the memory that the UE(s) to participate in joint model inference can provide, and the remaining battery of the UE(s) to participate in joint model inference; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the twentieth request further include: the privacy level of the dataset required for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    determine, according to the parameters carried in the twentieth request, whether the AF and the UE(s) to participate in joint model inference have the capability to fully support performing joint model inference;
    if the AF and the UE(s) to participate in joint model inference have the capability to fully support performing joint model inference, use the parameters carried in the twentieth request as the first message;
    if the AF and the UE(s) to participate in joint model inference do not have the capability to fully support performing joint model inference, send a twenty-first request to the NWDAF, the parameters carried in the twenty-first request including the parameters carried in the twentieth request, the twenty-first request being used to request the NWDAF to find, through the NF, other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the parameters carried in the twenty-first request further include: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    receive the result of the twenty-first request sent by the NWDAF, and use the result of the twenty-first request together with the parameters carried in the twentieth request as the first message; wherein the result of the twenty-first request includes: the other UE(s) that can participate in model inference and the computing power, memory, and remaining battery that the other UE(s) can offer; if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, the result of the twenty-first request further includes: the privacy level of the dataset required by the other UE(s) for AI/ML model inference and/or the latency requirement information of the different layers of the model;
    wherein, if the parameters carried in the twentieth request do not include the latency requirement information of the different layers of the model, the latency requirement information of the different layers of the model is provided by the AF.
  44. An apparatus for assisting model segmentation, applied to a user equipment (UE), the apparatus comprising:
    a determining unit, configured to determine AI/ML model split point information according to the UE's own capability information;
    a processing unit, configured to use the AI/ML model split point information as the segmentation result of the AI/ML model segmentation.
  45. The apparatus according to claim 44, wherein the determining unit is specifically configured to:
    determine the model split point information according to the computing power that can be provided, the memory that can be provided, and the remaining battery in the UE's own capability information;
    if the dataset privacy level and/or the per-layer latency of the model is required when performing AI/ML model inference, determine the model split point information according to the computing power that can be provided, the memory that can be provided, the remaining battery, the privacy level of the dataset required for AI/ML model inference, and/or the latency requirement information of the different layers of the model in the UE's own capability information.
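Claims 44 and 45 describe the UE choosing an AI/ML model split point from its own capability information (compute, memory, battery, and optionally per-layer latency requirements). A minimal sketch is shown below, assuming per-layer compute and memory costs of the model are known to the UE; the greedy deepest-feasible-layer rule is an illustrative heuristic, not the claimed algorithm.

```python
def choose_split_point(layer_costs_gflops, layer_mem_mb,
                       ue_compute_gflops, ue_memory_mb,
                       layer_latency_req_ms=None, layer_latency_ue_ms=None):
    """Pick the deepest split point k such that the UE can execute layers
    0..k-1 locally within its compute and memory budget (and, when latency
    information is given, within each layer's latency requirement); layers
    k onward would run on the network side. Returns k."""
    used_compute = 0.0
    used_mem = 0.0
    k = 0
    for i, (cost, mem) in enumerate(zip(layer_costs_gflops, layer_mem_mb)):
        # stop once the cumulative local cost exceeds the UE's budget
        if used_compute + cost > ue_compute_gflops or used_mem + mem > ue_memory_mb:
            break
        # stop if the UE cannot meet this layer's latency requirement
        if (layer_latency_req_ms is not None
                and layer_latency_ue_ms is not None
                and layer_latency_ue_ms[i] > layer_latency_req_ms[i]):
            break
        used_compute += cost
        used_mem += mem
        k = i + 1
    return k
```

A real implementation would also fold in the remaining battery and the dataset privacy level (e.g. forcing privacy-sensitive early layers to stay on the UE); those inputs are named in claim 45 but omitted here for brevity.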
  46. A processor-readable storage medium, wherein the processor-readable storage medium stores a computer program, and the computer program is used to cause the processor to execute the method according to any one of claims 1 to 29.
PCT/CN2022/131852 2021-12-23 2022-11-15 Method and apparatus for assisting model segmentation, and readable storage medium WO2023116259A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111608193.5 2021-12-23
CN202111608193.5A CN116341673A (zh) 2021-12-23 2021-12-23 Method and apparatus for assisting model segmentation, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023116259A1 (zh)

Family

ID=86879558

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131852 WO2023116259A1 (zh) 2022-11-15 2021-12-23 2022-11-15 Method and apparatus for assisting model segmentation, and readable storage medium

Country Status (2)

Country Link
CN (1) CN116341673A (zh)
WO (1) WO2023116259A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287609A (zh) * 2020-12-28 2021-01-29 之江实验室 一种面向机器人任务划分的端、边、云协同计算装置
WO2021142637A1 (zh) * 2020-01-14 2021-07-22 Oppo广东移动通信有限公司 人工智能操作处理方法、装置、系统、终端及网络设备
WO2021163895A1 (zh) * 2020-02-18 2021-08-26 Oppo广东移动通信有限公司 网络模型的管理方法及建立或修改会话的方法、装置
CN113436208A (zh) * 2021-06-30 2021-09-24 中国工商银行股份有限公司 基于端边云协同的图像处理方法、装置、设备及介质


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "5G Standard Study Notes (3GPP TS 23.501)", 15 November 2020 (2020-11-15), pages 1 - 14, XP093073964, Retrieved from the Internet <URL:https://blog.csdn.net/zzh123666/article/details/109706670> [retrieved on 20230816] *

Also Published As

Publication number Publication date
CN116341673A (zh) 2023-06-27


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22909576; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)