WO2024032214A1 - An inference method and related apparatus - Google Patents

An inference method and related apparatus

Info

Publication number
WO2024032214A1
Authority
WO
WIPO (PCT)
Prior art keywords
federated learning
features
feature
local
client
Prior art date
Application number
PCT/CN2023/103784
Other languages
English (en)
French (fr)
Inventor
邵云峰
吴骏
郑青
卢嘉勋
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2024032214A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 — Computing arrangements using knowledge-based models
    • G06N5/04 — Inference or reasoning models

Definitions

  • The present application relates to the field of computer technology, and in particular to an inference method and related apparatus.
  • Federated learning is a distributed machine learning paradigm in which multiple parties use their own data to collaboratively train artificial intelligence models without aggregating original data from multiple parties (different organizations or users).
  • the traditional machine learning paradigm requires the collection of large amounts of raw data for model training, and the raw data used for training is likely to come from multiple different organizations or users. Bringing together the raw data of multiple different organizations or different users is very likely to cause the risk of data leakage. For organizations, it will expose information assets, and for individual users, it may leak personal privacy.
  • The problems above pose severe challenges to the training of artificial intelligence models; federated learning technology emerged to address them.
  • Federated learning allows each party's original data to be retained locally: instead of aggregating multi-party data, the parties jointly train artificial intelligence models through secure collaborative computation and the exchange of intermediate calculation results.
  • In this way, each party's user data is protected while multi-party data is still fully utilized to collaboratively train a more powerful model.
  • Typical federated learning can be divided into three paradigms according to scenarios: horizontal federation, vertical federation, and federated migration (federated domain adaptation).
  • In a horizontal federated learning architecture, there is usually one server and multiple clients participating in the horizontal federation.
  • the client uses local data for training and uploads the trained model to the server; the server performs a weighted average of the models uploaded by all clients to obtain the global model.
  • the global model will be delivered to the client and used for client inference.
  • the client performs inference based on the global model issued by the server, and manages the resources of the device based on the inference results.
  • If the client's inference results are not accurate enough, the device's resource management in turn becomes disordered.
  • This application provides an inference method and related devices.
  • During inference, the method utilizes information from other clients' data in addition to local data, improving the accuracy of the inference results.
  • the first aspect of this application provides an inference method.
  • the method includes: a first federated learning client sends a first local feature to the federated learning server, where the first local feature is extracted from the first network data.
  • the first network data is data related to the target network resource obtained by the first federated learning client in the first time slot, and the target network resource is a network resource managed by the first federated learning client.
  • the first federated learning client receives global prior features from the federated learning server, where the global prior features are obtained based on the first local feature and a second local feature.
  • the second local feature is provided by a second federated learning client; the first federated learning client performs inference based on the global prior features and second network data to obtain an inference result, where the second network data is data related to the target network resource obtained by the first federated learning client in a second time slot, and the inference result is used to manage the target network resource; the second time slot is the same as the first time slot or after the first time slot.
  • The first federated learning client collects the first network data related to the target network resource in the first time slot, extracts the first local feature from it, and sends the feature to the federated learning server; the federated learning server collects the local features uploaded by multiple clients, computes the global prior features, and sends them to each client.
  • the first federated learning client can reason about the second network data in the second time slot based on the global prior features.
  • the inference process uses information from other client data in addition to local data to improve the accuracy of the inference results.
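The round described above can be sketched as follows. The feature extractor, the aggregation rule (a plain mean), and the way the prior is combined with fresh local data are all illustrative assumptions; the patent leaves these open:

```python
import numpy as np

def extract_local_feature(network_data: np.ndarray) -> np.ndarray:
    """Toy local feature: per-dimension mean of the client's slot data."""
    return network_data.mean(axis=0)

def aggregate_global_prior(local_features: list) -> np.ndarray:
    """Server side: one simple choice is the element-wise mean of all
    uploaded local features (the aggregation rule is an assumption)."""
    return np.mean(local_features, axis=0)

def infer(global_prior: np.ndarray, second_slot_data: np.ndarray) -> np.ndarray:
    """Client side: combine the global prior with fresh second-slot data.
    Here we simply concatenate both into one inference input vector."""
    local = second_slot_data.mean(axis=0)
    return np.concatenate([global_prior, local])

# one round with two clients, 4-dimensional network measurements
rng = np.random.default_rng(0)
client_a = rng.normal(size=(10, 4))   # first time slot, client A
client_b = rng.normal(size=(10, 4))   # first time slot, client B

prior = aggregate_global_prior([extract_local_feature(client_a),
                                extract_local_feature(client_b)])
result = infer(prior, rng.normal(size=(10, 4)))  # second time slot
print(result.shape)   # (8,)
```

Only the low-dimensional local features cross the network; the raw slot data never leaves each client, which is the point of the scheme.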
  • the first network data is a sampled value of the target-network-resource-related data in the first time slot, or a statistical value over the period from a third time slot to the first time slot
  • the second network data is a sampled value of the target-network-resource-related data in the second time slot, and the third time slot is before the first time slot
  • the global prior feature is a feature vector or a first machine learning model
  • the first machine learning model is used for inference on the second network data.
  • when the global prior features are a feature vector, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client performs inference based on the global prior features, the second network data, and a local second machine learning model to obtain the inference result, where the second machine learning model is used for inference on the second network data.
  • the first federated learning client needs to input the global prior features and the second network data into the local second machine learning model for inference, and use the output result of the second machine learning model as the inference result.
  • using a trainable second machine learning model for inference can improve the accuracy of the solution.
  • the first federated learning client performing inference based on the global prior features, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client inputs the second network data into a third machine learning model to obtain multiple features of the second network data output by the third machine learning model; the first federated learning client inputs the global prior features into the second machine learning model to obtain the respective weights of the multiple features of the second network data; the first federated learning client determines the inference result based on the multiple features of the second network data and their respective weights.
  • Before inputting data into the second machine learning model, the first federated learning client can input the second network data into the third machine learning model to obtain multiple features that reflect the characteristics of the second network data, which saves computing resources.
  • the first federated learning client inputs the global prior features and the second network data into the local second machine learning model for inference.
  • The first federated learning client inputs the global prior features into the second machine learning model to obtain the weights corresponding to the multiple features of the second network data, i.e., each feature corresponds to one weight; it then determines the inference result from the multiple features and their respective weights. Assigning different weights to different features improves the accuracy of inference.
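A minimal sketch of this weighting step, assuming for illustration that the "second machine learning model" is a single linear layer followed by a softmax (the patent does not fix its architecture):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def weighted_feature_inference(global_prior, features, W):
    """Hypothetical 'second machine learning model': a linear layer W maps
    the global prior to one weight per feature of the second network data,
    and the inference result is the weight-combined feature vector."""
    logits = W @ global_prior          # (n_features,)
    weights = softmax(logits)          # one weight per feature, summing to 1
    # features: (n_features, feature_dim); weighted sum over features
    return weights @ features, weights

rng = np.random.default_rng(1)
prior = rng.normal(size=6)             # global prior feature vector
feats = rng.normal(size=(3, 5))        # 3 features of the second network data
W = rng.normal(size=(3, 6))            # assumed weight matrix of the model
result, weights = weighted_feature_inference(prior, feats, W)
print(result.shape, round(float(weights.sum()), 6))   # (5,) 1.0
```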
  • the second machine learning model includes multiple first task models; the first federated learning client performing inference based on the global prior features, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client calculates the respective weights of the multiple first task models based on the global prior features; the first federated learning client inputs the features of the second network data into the multiple first task models to obtain the inference features output by the multiple first task models; the first federated learning client obtains the inference result based on the respective weights of the multiple first task models and the inference features they output.
  • the second machine learning model includes multiple first task models, which can process multiple features of the second network data respectively.
  • The respective weights of the multiple first task models can be determined from the global prior features. The first federated learning client can input the features of the second network data into the multiple first task models, where each first task model accepts a preset category of feature, i.e., the features of the second network data are classified by category and fed to the first task model of the corresponding category.
  • The first federated learning client can then take the inference features output by the multiple first task models and combine them with the models' corresponding weights in a weighted average to obtain the inference result. Processing different types of features in different first task models and then weighting their outputs improves the accuracy of inference.
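This mixture of first task models can be sketched as follows. The linear gate (weights from the global prior) and linear task models are hypothetical stand-ins for whatever trained models the implementation uses:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class TaskModel:
    """Stand-in first task model: a fixed linear map for one feature category."""
    def __init__(self, rng, in_dim, out_dim):
        self.W = rng.normal(size=(out_dim, in_dim))
    def __call__(self, feature):
        return self.W @ feature

def mixture_inference(global_prior, features, task_models, gate_W):
    # respective weights of the task models computed from the global prior
    weights = softmax(gate_W @ global_prior)
    # each task model processes the feature of its own category
    outputs = np.stack([m(f) for m, f in zip(task_models, features)])
    # weighted average of the task-model outputs is the inference result
    return weights @ outputs

rng = np.random.default_rng(2)
models = [TaskModel(rng, 4, 2) for _ in range(3)]   # 3 first task models
feats = [rng.normal(size=4) for _ in range(3)]      # one feature per category
prior = rng.normal(size=5)
gate = rng.normal(size=(3, 5))                      # assumed gating matrix
print(mixture_inference(prior, feats, models, gate).shape)  # (2,)
```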
  • the first federated learning client calculating the respective weights of the multiple first task models based on the global prior features includes: the first federated learning client calculates the respective weights of the multiple first task models based on the global prior features and the second network data.
  • the first federated learning client calculates the weight of the first task model based on the client's local data and global prior features, and uses the second network data with real-time properties to improve the accuracy of the weight.
  • the method further includes: the first federated learning client extracts features of the second network data through a third machine learning model.
  • the characteristics of the second network data are used for inference to reduce the amount of inference calculations.
  • the third machine learning model includes a plurality of second task models; the first federated learning client extracting features of the second network data through the third machine learning model includes: the first federated learning client determines the respective weights of the plurality of second task models according to the second network data; the first federated learning client inputs the second network data into the plurality of second task models to obtain the sub-features of the second network data output by the plurality of second task models; and the first federated learning client obtains the features of the second network data based on the respective weights of the plurality of second task models and the sub-features they output.
  • The first federated learning client first determines the weights corresponding to the multiple second task models based on the local data or the second network data, then inputs the second network data into the second task models to obtain the sub-features they output.
  • The first federated learning client can then combine the sub-features output by the second task models with the models' corresponding weights in a weighted average to obtain the features of the second network data. Processing different types of features in different task models and then weighting their outputs improves the accuracy of inference.
  • each second task model is a single-layer autoencoder
  • the reconstruction target of the r-th of the multiple second task models is the residual of the (r-1)-th task model, where r is an integer greater than 1 and no greater than the number of second task models.
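A minimal sketch of this residual-reconstruction stack, using closed-form single-layer (PCA-like) autoencoders as stand-ins for the trained ones; each model after the first is fitted to the residual left by its predecessor:

```python
import numpy as np

class OneLayerAE:
    """Single-layer linear autoencoder solved in closed form (PCA-like)."""
    def __init__(self, k):
        self.k = k                       # bottleneck width
    def fit(self, X):
        # top-k right singular vectors serve as encoder/decoder weights
        _, _, vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
        self.V = vt[: self.k].T          # (dim, k)
        self.mean = X.mean(0)
        return self
    def encode(self, X):
        return (X - self.mean) @ self.V
    def reconstruct(self, X):
        return self.encode(X) @ self.V.T + self.mean

def fit_residual_stack(X, n_models, k=1):
    """Model r (r > 1) is trained to reconstruct the residual left by
    model r-1, matching the residual-reconstruction scheme above."""
    models, target = [], X
    for _ in range(n_models):
        m = OneLayerAE(k).fit(target)
        target = target - m.reconstruct(target)  # residual for the next model
        models.append(m)
    return models

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 6))             # toy second-network-data matrix
stack = fit_residual_stack(X, n_models=3)
print(len(stack))   # 3
```

The codes produced by the successive models play the role of the sub-features combined in the weighted average described above.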
  • when the global prior features are the first machine learning model, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client extracts features of the second network data; the first federated learning client inputs the features of the second network data into the first machine learning model to obtain the inference result output by the first machine learning model.
  • the federated learning server can directly deliver the first machine learning model as the inference model to improve the flexibility of the solution.
  • when the global prior features are the first machine learning model, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client trains the first machine learning model using sample data; the first federated learning client extracts features of the second network data; the first federated learning client inputs the features of the second network data into the trained first machine learning model to obtain the inference result output by the trained first machine learning model.
  • When the federated learning server directly delivers the first machine learning model as the inference model, the client can further train it on local sample data to improve the accuracy of the inference model.
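Local fine-tuning of a delivered model can be sketched as follows, assuming for illustration that the delivered model is a linear regressor refined with a few gradient steps on the client's own samples (the patent does not fix the model class or training procedure):

```python
import numpy as np

def fine_tune(W, X, y, lr=0.1, steps=500):
    """Hypothetical local fine-tuning: gradient descent on the mean squared
    error of the delivered linear model W using local samples (X, y)."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ W - y) / len(X)   # MSE gradient
        W = W - lr * grad
    return W

rng = np.random.default_rng(5)
X = rng.normal(size=(32, 3))          # local sample data
true_W = rng.normal(size=(3,))
y = X @ true_W                        # local labels (noise-free toy case)
W0 = rng.normal(size=(3,))            # model as delivered by the server
W = fine_tune(W0, X, y)
print(float(np.mean((X @ W - y) ** 2)) < 1e-6)   # True
```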
  • the method further includes: the first federated learning client sends grouping information to the federated learning server, the grouping information indicating the group in which the first local feature is located, so that the federated learning server can obtain the global prior features based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located.
  • The first federated learning client can also send grouping information to the server, indicating the group in which each local feature is located, so that the server can obtain the global prior features based on the local features from multiple clients and the groups in which those features are located. This prevents all client data from jointly determining the global prior features and thereby interfering across groups.
  • the method further includes: the first federated learning client receives task synchronization information from the federated learning server, the task synchronization information being used to indicate the first time slot; the first federated learning client selects the first network data from local data according to the task synchronization information.
  • The federated learning server indicates the time slot to which the local features uploaded by the client should belong, improving the synchronization of the data uploaded by the clients.
  • the second aspect of this application provides an inference method, which includes: the federated learning server receives a first local feature from a first federated learning client, the first local feature being extracted from first network data, where the first network data is data related to a target network resource obtained by the first federated learning client in a first time slot, and the target network resource is a network resource managed by the first federated learning client; the federated learning server obtains global prior features based on the first local feature and a second local feature, the second local feature being provided by a second federated learning client; the federated learning server sends the global prior features to the first federated learning client, so that the first federated learning client can perform inference based on the global prior features and second network data to obtain an inference result.
  • the second network data is the data related to the target network resource obtained by the first federated learning client in the second time slot.
  • the inference result is used to manage the target network resource, wherein the second time slot is the same as or after the first time slot.
  • the first network data is a sampled value of the target-network-resource-related data in the first time slot, or a statistical value over the period from a third time slot to the first time slot
  • the second network data is a sampled value of the target-network-resource-related data in the second time slot, and the third time slot is before the first time slot
  • the global prior feature is a feature vector or a first machine learning model
  • the first machine learning model is used for inference of the second network data.
  • the method further includes: the federated learning server receives grouping information from the first federated learning client, where the grouping information from the first federated learning client indicates the group in which the first local feature is located; the federated learning server obtaining the global prior feature based on the first local feature and the second local feature includes: the federated learning server obtains the global prior feature based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located, the group in which the second local feature is located being indicated by grouping information from the second federated learning client.
  • the first local feature includes a first sub-feature and a second sub-feature
  • the second local feature includes a third sub-feature and a fourth sub-feature
  • the grouping information from the first federated learning client indicates the first The group where the sub-feature is located and the group where the second sub-feature is located
  • the group information from the second federated learning client indicates the group where the third sub-feature is located and the group where the fourth sub-feature is located
  • the group in which the first sub-feature is located and the group in which the third sub-feature is located are the same group; the federated learning server obtaining the global prior features based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located includes: given that the first sub-feature and the third sub-feature are in the same group, the federated learning server processes the first sub-feature and the third sub-feature to obtain an intermediate feature; the federated learning server then obtains the global prior features based on the intermediate feature, the second sub-feature, the group in which the second sub-feature is located, the fourth sub-feature, and the group in which the fourth sub-feature is located.
  • the first local feature uploaded by the first federated learning client includes the first sub-feature and the second sub-feature
  • the second local feature uploaded by the second federated learning client includes the third sub-feature and the fourth sub-feature.
  • the group information from the first federated learning client can indicate the group where the first sub-feature is located and the group where the second sub-feature is located
  • the group information from the second federated learning client indicates the group where the third sub-feature is located and the group where the fourth sub-feature is located.
  • The grouping information thus describes the group of each sub-feature.
  • Since the first and third sub-features are in the same group, the federated learning server can first process them to obtain an intermediate feature, and then, according to the groups, process the intermediate feature together with the second sub-feature and the fourth sub-feature to obtain the global prior features.
  • Features uploaded by different types of clients can thereby be processed separately, reducing the influence between different types of clients.
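Group-wise aggregation can be sketched as follows. Fusing same-group sub-features by mean and concatenating the per-group results are illustrative choices, since the patent does not fix the processing:

```python
import numpy as np

def grouped_global_prior(sub_features, groups):
    """sub_features: list of sub-feature vectors uploaded by the clients;
    groups: group label per sub-feature (from the grouping information).
    Sub-features in the same group are fused first (mean here, as the
    'intermediate feature'), and the per-group results are concatenated
    into the global prior."""
    fused = []
    for g in sorted(set(groups)):
        members = [f for f, gg in zip(sub_features, groups) if gg == g]
        fused.append(np.mean(members, axis=0))   # intermediate feature
    return np.concatenate(fused)

# first client uploads sub-features 1 and 2, second client uploads 3 and 4;
# sub-features 1 and 3 share group "a", so they are fused together
subs = [np.ones(2), 2 * np.ones(2), 3 * np.ones(2), 4 * np.ones(2)]
prior = grouped_global_prior(subs, groups=["a", "b", "a", "c"])
print(prior)   # [2. 2. 2. 2. 4. 4.]
```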
  • the federated learning server obtaining the global prior features based on the first local feature and the second local feature includes: the federated learning server obtains the global prior features based on the first local feature, the second local feature, historical local features from the first federated learning client, and historical local features from the second federated learning client.
  • When the federated learning server determines the global prior features, in addition to the first and second local features it can also take into account the historical local features from the first federated learning client and from the second federated learning client, improving the accuracy of the global prior features.
  • the federated learning server obtaining the global prior features based on the first local feature, the second local feature, and the historical local features from the first and second federated learning clients includes: the federated learning server calculates the similarity between the local features of the current inference process and multiple sets of historical local features, where the local features of the current inference process include the first local feature and the second local feature, and each set of historical local features corresponds to one historical inference process; based on the similarities, the historical prior features corresponding to the sets of historical local features are weighted and summed to obtain the global prior features.
  • The federated learning server can calculate the similarity between the local features of the current inference process and multiple sets of historical local features, i.e., determine the similarity between the first local feature and the historical local features from the first federated learning client in each set, and the similarity between the second local feature and the historical local features from the second federated learning client in each set; based on these similarities, the historical prior features corresponding to the multiple sets are weighted and summed to obtain the required global prior features, improving the accuracy of the global prior features.
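The similarity-weighted combination of historical priors can be sketched as follows. Cosine similarity and a softmax weighting are illustrative assumptions; the patent specifies neither the similarity measure nor the weighting scheme:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def prior_from_history(current_locals, history):
    """history: list of (historical_local_features, historical_prior) pairs,
    one per past inference process. The similarity between the current local
    features and each stored set weights that set's historical prior."""
    cur = np.concatenate(current_locals)
    sims = np.array([cosine(cur, np.concatenate(h)) for h, _ in history])
    weights = np.exp(sims) / np.exp(sims).sum()   # softmax over similarities
    priors = np.stack([p for _, p in history])
    return weights @ priors                        # weighted sum of priors

rng = np.random.default_rng(4)
# 5 historical sets: (locals of two clients, the prior used back then)
hist = [([rng.normal(size=3), rng.normal(size=3)], rng.normal(size=4))
        for _ in range(5)]
prior = prior_from_history([rng.normal(size=3), rng.normal(size=3)], hist)
print(prior.shape)   # (4,)
```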
  • the multiple sets of historical local features have labels, the labels being the manually annotated actual results of each set; the method further includes: the federated learning server receives the inference result from the first federated learning client; the federated learning server determines a target inference result based on the inference results of the first and second federated learning clients; when the similarity between the local features of the current inference process and the historical local features of a target set is greater than or equal to a threshold, the federated learning server updates the historical local features of the target set based on that similarity, the target set being one of the multiple sets of historical local features and its label being the target inference result; otherwise, the federated learning server adds a new set of historical local features to the multiple sets, the added set being the local features of the current inference process.
  • The multiple sets of historical local features may also carry labels, which are the manually annotated actual results of each set; these can be compared with the inference results to judge the accuracy of the inference results.
  • After inference, the clients can upload their inference results to the federated learning server.
  • The federated learning server can determine the target inference result corresponding to the current global prior features based on the inference results of multiple clients, and can update or extend the multiple sets of historical local features accordingly, so that the sample library can be updated and supplemented at any time, improving its effectiveness.
  • the method further includes: the federated learning server sends task synchronization information to the first federated learning client, where the task synchronization information is used to indicate the first time slot, so that the first federated learning client selects the first network data from local data according to the task synchronization information.
  • the third aspect of this application provides an inference system.
  • the system includes: a first federated learning client sends a first local feature to the federated learning server, the first local feature being extracted from first network data, where the first network data is data related to a target network resource obtained by the first federated learning client in a first time slot, and the target network resource is a network resource managed by the first federated learning client; the federated learning server obtains global prior features based on the first local feature and a second local feature.
  • the second local feature is provided by a second federated learning client; the federated learning server sends the global prior features to the first federated learning client; the first federated learning client receives the global prior features from the federated learning server and performs inference based on the global prior features and second network data to obtain an inference result, where the second network data is data related to the target network resource obtained by the first federated learning client in a second time slot, and the inference result is used to manage the target network resource; the second time slot is the same as the first time slot or after the first time slot.
  • the first network data is a sampled value of the target-network-resource-related data in the first time slot, or a statistical value over the period from a third time slot to the first time slot
  • the second network data is a sampled value of the target-network-resource-related data in the second time slot, and the third time slot is before the first time slot
  • the global prior feature is a feature vector or a first machine learning model
  • the first machine learning model is used for inference of the second network data.
  • when the global prior features are a feature vector, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client performs inference based on the global prior features, the second network data, and a local second machine learning model to obtain the inference result, where the second machine learning model is used for inference on the second network data.
  • the first federated learning client performing inference based on the global prior features, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client inputs the second network data into a third machine learning model to obtain multiple features of the second network data output by the third machine learning model; the first federated learning client inputs the global prior features into the second machine learning model to obtain the respective weights of the multiple features of the second network data; the first federated learning client determines the inference result based on the multiple features of the second network data and their respective weights.
  • the second machine learning model includes multiple first task models; the first federated learning client performing inference based on the global prior features, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client calculates the respective weights of the multiple first task models based on the global prior features; the first federated learning client inputs the features of the second network data into the multiple first task models to obtain the inference features output by the multiple first task models; the first federated learning client obtains the inference result based on the respective weights of the multiple first task models and the inference features they output.
  • the first federated learning client calculating the respective weights of the multiple first task models based on the global prior features includes: the first federated learning client calculates the respective weights of the multiple first task models based on the global prior features and the second network data.
  • the system further includes: the first federated learning client extracts features of the second network data through a third machine learning model.
  • the third machine learning model includes a plurality of second task models; the first federated learning client extracting features of the second network data through the third machine learning model includes: the first federated learning client determines the respective weights of the plurality of second task models according to the second network data; the first federated learning client inputs the second network data into the plurality of second task models to obtain the sub-features of the second network data output by the plurality of second task models; and the first federated learning client obtains the features of the second network data based on the respective weights of the plurality of second task models and the sub-features they output.
• each second task model is a single-layer autoencoder.
• the reconstruction target of the r-th task model among the multiple second task models is the residual of the (r-1)-th task model, where r is an integer greater than 1 and not greater than the number of second task models.
• when the global prior features are the first machine learning model, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client extracts features of the second network data; the first federated learning client inputs the features of the second network data into the first machine learning model to obtain the inference result output by the first machine learning model.
• when the global prior features are the first machine learning model, the first federated learning client performing inference based on the global prior features and the second network data to obtain the inference result includes: the first federated learning client trains the first machine learning model using sample data; the first federated learning client extracts features of the second network data; the first federated learning client inputs the features of the second network data into the trained first machine learning model to obtain the inference result output by the trained first machine learning model.
• the system further includes: the first federated learning client sends grouping information to the federated learning server, where the grouping information of the first federated learning client indicates the group in which the first local feature is located; the federated learning server receives the grouping information from the first federated learning client; the federated learning server obtaining the global prior features based on the first local feature and the second local feature includes: the federated learning server obtains the global prior features based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located.
  • the group in which the second local feature is located is indicated by the group information from the second federated learning client.
  • the first local feature includes a first sub-feature and a second sub-feature
  • the second local feature includes a third sub-feature and a fourth sub-feature
• the grouping information from the first federated learning client indicates the group in which the first sub-feature is located and the group in which the second sub-feature is located
  • the group information from the second federated learning client indicates the group where the third sub-feature is located and the group where the fourth sub-feature is located
• the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located; the federated learning server obtaining the global prior features based on the first local feature, the group in which the first local feature is located, the second local feature and the group in which the second local feature is located includes: since the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located, the federated learning server processes the first sub-feature and the third sub-feature to obtain an intermediate feature; the federated learning server then obtains the global prior features based on the intermediate feature, the second sub-feature, the group in which the second sub-feature is located, the fourth sub-feature, and the group in which the fourth sub-feature is located.
• the federated learning server obtaining the global prior features based on the first local feature and the second local feature includes: the federated learning server obtains the global prior features based on the first local feature, the second local feature, the historical local features from the first federated learning client, and the historical local features from the second federated learning client.
• the federated learning server obtaining the global prior features based on the first local feature, the second local feature, the historical local features from the first federated learning client, and the historical local features from the second federated learning client includes: the federated learning server calculates the similarity between the local features of the current inference process and multiple sets of historical local features, where the local features of the current inference process include the first local feature and the second local feature, and each set of historical local features includes the local features of one historical inference process; based on these similarities, the historical prior features corresponding to the sets of historical local features are weighted and summed to obtain the global prior features.
• the multiple sets of historical local features have labels, and each label is the manually annotated actual result of the corresponding set of historical local features; the system also includes: the first federated learning client sends a target inference result to the federated learning server; the target set of historical local features is the set, among the multiple sets, whose label is the target inference result; when the similarity between the local features of the current inference process and the target set of historical local features is less than a threshold, the federated learning server adds a set of historical local features to the multiple sets, the added set being the local features of the current inference process.
• the system further includes: the federated learning server sends task synchronization information to the first federated learning client, where the task synchronization information is used to indicate the first time slot; the first federated learning client selects the first network data from local data according to the task synchronization information.
  • the fourth aspect of this application provides an inference device that can implement the method in the above first aspect or any possible implementation of the first aspect.
  • the device includes corresponding units or modules for performing the above method.
  • the units or modules included in the device can be implemented by software and/or hardware.
  • the device may be, for example, a network device, a chip, a chip system, a processor, etc. that supports the network device to implement the above method, or a logic module or software that can realize all or part of the network device functions.
  • the fifth aspect of this application provides an inference device that can implement the method in the above second aspect or any possible implementation of the second aspect.
  • the device includes corresponding units or modules for performing the above method.
  • the units or modules included in the device can be implemented by software and/or hardware.
  • the device may be, for example, a network device, a chip, a chip system, a processor, etc. that supports the network device to implement the above method, or a logic module or software that can realize all or part of the network device functions.
• a sixth aspect of the present application provides a computer device, including: a processor, the processor is coupled to a memory, and the memory is used to store instructions; when the instructions are executed by the processor, the computer device implements the method in the above first aspect or any possible implementation of the first aspect.
  • the computer device may be, for example, a network device, or may be a chip or chip system that supports the network device to implement the above method.
• a seventh aspect of the present application provides a computer device, including: a processor, the processor is coupled to a memory, and the memory is used to store instructions; when the instructions are executed by the processor, the computer device implements the method in the above second aspect or any possible implementation of the second aspect.
  • the computer device may be, for example, a network device, or may be a chip or chip system that supports the network device to implement the above method.
  • the eighth aspect of the present application provides a computer-readable storage medium.
• the computer-readable storage medium stores instructions; when the instructions are executed by a processor, the method provided by the first aspect or any possible implementation of the first aspect, or by the second aspect or any possible implementation of the second aspect, is realized.
  • a ninth aspect of the present application provides a chip system.
  • the chip system includes at least one processor.
• the processor is configured to execute computer programs or instructions stored in a memory; when the computer program or instructions are executed on the at least one processor, the method provided by the aforementioned first aspect or any possible implementation of the first aspect, or by the second aspect or any possible implementation of the second aspect, is implemented.
  • a tenth aspect of the present application provides a computer program product.
  • the computer program product includes computer program code.
• when the computer program code is executed on a computer, the method in the first aspect or any possible implementation manner of the first aspect is realized.
• an eleventh aspect of this application provides a communication system.
  • the communication system includes a first federated learning client and a federated learning server.
• the first federated learning client is used to implement the method provided by the aforementioned first aspect or any possible implementation manner of the first aspect.
  • the federated learning server is used to implement the method provided by the aforementioned second aspect or any possible implementation manner of the second aspect.
  • Figure 1 is a schematic structural diagram of a computer system provided by an embodiment of the present application.
  • Figure 2 is a schematic flowchart of a reasoning method provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a reasoning structure provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of another reasoning structure provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of another reasoning structure provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a calculation graph provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of a knowledge graph provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of another reasoning structure provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of the acquisition method of local data on the federated learning server provided by the embodiment of this application.
  • Figure 10 is a schematic flowchart of a fixed parameter task selector generation provided by an embodiment of the present application.
  • Figure 11 is a schematic flow chart of task model selection provided by an embodiment of the present application.
  • Figure 12 is a schematic flowchart of model pre-training provided by the embodiment of the present application.
  • Figure 13 is a schematic flow chart of model training provided by an embodiment of the present application.
  • Figure 14 is a schematic flowchart of a model training according to the strategy provided by the embodiment of the present application.
  • Figure 15 is a schematic structural diagram of an inference device provided by an embodiment of the present application.
  • Figure 16 is a schematic structural diagram of another reasoning device provided by an embodiment of the present application.
  • Figure 17 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Figure 18 is a schematic structural diagram of another computer device provided by an embodiment of the present application.
• "at least one of a, b, or c" can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b and c can each be single or multiple.
  • Figure 1 is a schematic structural diagram of a computer system suitable for embodiments of the present application.
  • the computer system includes a federated learning server and multiple clients.
  • the federated learning server cooperates with multiple clients for model training and model-based inference. Among them, the number of clients can be adjusted according to actual needs.
  • embodiments of this application provide an inference method.
  • the server participates in the inference process.
• the federated learning server obtains global prior features based on the local features of multiple clients and sends the global prior features to each client, allowing each client to use the global prior features and local data for inference. Since the global prior features are obtained based on the local features of multiple clients, and the local features are extracted from the first network data, each client uses information from other clients' data in the inference process, which can improve the accuracy of inference results.
  • the reasoning method provided by the embodiments of this application can be applied in a variety of application scenarios, for example, it can be applied to speech recognition scenarios and image recognition scenarios.
• take the speech recognition scenario as an example: at a concert, a person uses a voice assistant to perform a speech-to-text operation.
• a single device cannot determine whether the received speech comes from the background (the concert lyrics) or from the person who is recording the speech.
• the voice information collected by other users' devices at the same concert is needed to determine whether the voice received by the device comes from the background (the concert lyrics) or is the voice input by the user during recording.
  • the inference method provided by the embodiment of the present application can be used to perform inference based on the voice information collected by multiple users' devices to infer the source of the voice (background or input by the user).
  • inference can be made through images captured by the cameras of multiple vehicles to infer the current traffic conditions of the road (no congestion, normal congestion, or congestion caused by traffic accidents).
  • the data used by the inference method provided by the embodiments of the present application can be data in various forms such as images, voices, text, etc.
• part of the model needs to be used in the inference process, and this part of the model can be obtained through pre-training. For this reason, the following first introduces the inference method provided by the embodiments of the present application, and then introduces the training method of the model used by the inference method.
  • Figure 2 is a schematic flow chart of an inference method provided by an embodiment of the present application, applied to an inference system.
  • the inference system includes a server and multiple clients. Specifically, it can be the inference system shown in Figure 1.
  • this embodiment specifically includes:
• Step 201: The server sends task synchronization information to multiple clients, so that the multiple clients select the first network data from local data according to the task synchronization information; the task synchronization information represents the time requirement for the first network data.
  • the function of task synchronization information is to enable multiple clients to select the first network data related to the target network resource from local data to perform inference on the premise of meeting user needs.
  • the target network resource may be the current state of the target network.
  • the target network resource may be the fault state of the vehicle.
• the first network data may include at least one of the following data of the first vehicle: positioning data, driving speed data, driving mileage data, battery power data, battery voltage data, brake data, throttle data, motor voltage and current data, insulation resistance data, temperature data, etc., which is not limited in the embodiments of this application.
  • the embodiments of the present application do not specifically limit the content of the task synchronization information.
• the task synchronization information may be a specific moment, so that the first network data selected by the multiple clients from local data according to the task synchronization information correspond to the same moment, or to moments that are relatively close to each other.
• alternatively, the first network data may be data obtained by reducing the dimensionality of the data items over a period before that moment, such as the mean or median of multiple data items, the principal component values obtained through dimensionality reduction such as principal component analysis, or the cluster-center values obtained through clustering.
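The dimensionality-reduction option above (summarizing a window of data items before the synchronization moment into a representative value) can be sketched as follows. The function name, the `(timestamp, value)` layout, and the window convention are illustrative assumptions, not part of the embodiment:

```python
from statistics import mean, median

def reduce_window(samples, t, window):
    """Summarize the data items in the window [t - window, t] into
    representative values (mean and median), two of the reduction
    options mentioned above (PCA and cluster centers are others)."""
    selected = [v for (ts, v) in samples if t - window <= ts <= t]
    return {"mean": mean(selected), "median": median(selected)}

# toy history: (timestamp, value) pairs
history = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 100.0)]
summary = reduce_window(history, t=3, window=2)
```

With this toy history, only the samples at timestamps 1 to 3 fall in the window, so the late outlier at timestamp 4 does not influence the summary.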
  • the first federated learning client is one of multiple clients.
  • the first federated learning client can receive task synchronization information from the server.
• the task synchronization information represents the time requirement for the first network data among the local data of the first federated learning client.
  • Step 202 The first federated learning client selects the first network data from the local data according to the task synchronization information.
  • the task synchronization information indicates a time point
  • the first federated learning client can select the data at this time point from local data as the first network data.
  • the first network data is the sampling value of the data related to the target network resource in the first time slot or the statistical value from the third time slot to the first time slot, and the third time slot is before the first time slot.
  • the task synchronization information includes the time of 10:10
• based on the task synchronization information, the first federated learning client can select the local data at the moment 10:10 as the first network data; alternatively, it can select local data at a moment close to 10:10, or local data reduced along the data dimension, as the first network data.
• For example, the first federated learning client can select the local data from 10:09 to 10:11 as the first network data.
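The time-based selection in this example can be sketched as follows. Timestamps as minutes since midnight, the function name, and the tolerance parameter are illustrative assumptions:

```python
def select_first_network_data(local_data, sync_minute, tolerance=2):
    """Pick the local sample whose timestamp (minutes since midnight) is
    closest to the moment carried in the task synchronization information;
    return None if nothing falls within the tolerance."""
    best = min(local_data, key=lambda s: abs(s["t"] - sync_minute))
    return best if abs(best["t"] - sync_minute) <= tolerance else None

# toy local data around 10:09 (minute 609), 10:11 (611) and 11:40 (700)
samples = [{"t": 609, "v": 1.0}, {"t": 611, "v": 2.0}, {"t": 700, "v": 3.0}]
chosen = select_first_network_data(samples, sync_minute=610)  # 10:10 = minute 610
```

When no sample lies close enough to the synchronization moment, the sketch returns `None` rather than using stale data.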
  • Step 203 The first federated learning client extracts the features of the first network data from the first network data, and the features of the first network data include the first local features.
  • the feature of the first network data may include one feature or multiple features; when the feature of the first network data includes multiple features, the local feature may be one or more of the multiple features.
• the characteristics of the first network data are generally represented by feature vectors. Taking the case where the first network data has three features as an example, the features can be denoted h_1^n, h_2^n and h_3^n, where n is the identifier of the first federated learning client (for example 1, 2, 3, and so on), and different n refer to different clients. The local feature can be one or more of these three features; which feature or features to transmit can be agreed in advance, or can be determined on the client and/or server side based on the characteristics of the current first network data, client computing resources, and client network resources.
• the strategy for this determination can be manually written logic or a model obtained through machine learning.
  • the features of the first network data are also referred to as implicit features below.
• whether one feature or multiple features are transmitted depends on the characteristics of the original data: for example, if the original data has a small amount of information, one feature is transmitted directly; if the original data has a large amount of information, multiple features are transmitted.
  • the information content of raw data can be calculated using entropy, or it can be calculated through a neural network or machine learning model.
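The entropy-based option can be sketched as follows. The specific threshold and the "1 vs. 3 features" rule are illustrative assumptions; the embodiment only says that entropy (or a learned model) measures information content:

```python
import math
from collections import Counter

def shannon_entropy(values):
    # empirical Shannon entropy (in bits) of a sequence of discrete values
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def num_features_to_send(values, threshold=1.0):
    # hypothetical rule: low-information data -> transmit one feature,
    # high-information data -> transmit several (threshold is illustrative)
    return 1 if shannon_entropy(values) <= threshold else 3
```

Constant data has zero entropy and would trigger single-feature transmission; uniformly varied data has high entropy and would trigger multi-feature transmission.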
  • the first federated learning client can extract features of the first network data through a local feature extraction model.
  • a local feature extraction model There are many types of local feature extraction models, which are not specifically limited in the embodiments of this application.
• the local feature extraction model may include multiple second task models.
• in this case, step 203 may specifically be: the first federated learning client inputs the first network data into the multiple second task models to obtain the multiple features of the first network data output by the multiple second task models. That is, the first federated learning client inputs the first network data into different second task models according to type, and each second task model outputs the features corresponding to that type.
  • the embodiment of the present application does not specifically limit the type of the second task model.
• each second task model is a single-layer autoencoder.
• the reconstruction target of the r-th task model is the residual of the (r-1)-th task model, where r is an integer greater than 1 and not greater than the number of second task models.
  • an autoencoder is a neural network with the same input and learning objectives, and its structure is divided into two parts: an encoder and a decoder. After inputting the first network data, the hidden features output by the encoder, namely "encoded features", can be regarded as representations of the first network data.
  • the autoencoder can be implemented using a deep neural network (DNN), a convolutional neural network (CNN) or a Transformer neural network.
  • the local feature extraction model contains three layers of autoencoders.
• the input of the layer-1 autoencoder is the original feature X_n; the encoder E_1, which can be implemented with a deep neural network, maps it to the hidden feature h_1, whose dimension is lower than that of X_n. Through the decoder D_1 corresponding to E_1 (also implementable with a deep neural network), h_1 yields a reconstruction of the original feature X_n.
• the input of the layer-2 autoencoder is the original feature X_n; the encoder E_2, implementable with a deep neural network, maps it to the hidden feature h_2, whose dimension is lower than that of X_n. Through the decoder D_2 corresponding to E_2, h_2 yields a reconstruction of the residual between X_n and the layer-1 reconstruction.
• the input of the layer-3 autoencoder is the original feature X_n; the encoder E_3 maps it to the hidden feature h_3, whose dimension is lower than that of X_n. Through the decoder D_3 corresponding to E_3, h_3 yields a reconstruction of the residual between X_n and the sum of the layer-1 and layer-2 reconstructions.
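The layered reconstruction targets described above (layer 1 targets the original feature; each later layer targets the residual left by the layers before it) can be sketched as follows. The function name and list-based feature vectors are illustrative assumptions:

```python
def residual_targets(x, prev_reconstructions):
    """Reconstruction targets for a stack of autoencoders: layer 1
    reconstructs the original feature x, and layer r (r > 1) reconstructs
    the residual left over by the reconstructions of layers 1..r-1."""
    targets = [x[:]]          # layer-1 target is the original feature
    residual = x[:]
    for rec in prev_reconstructions:
        # subtract each earlier layer's reconstruction from the residual
        residual = [a - b for a, b in zip(residual, rec)]
        targets.append(residual[:])
    return targets
```

For an original feature `[10, 20]` with layer-1 reconstruction `[6, 12]` and layer-2 reconstruction `[3, 5]`, the targets are `[10, 20]`, `[4, 8]` and `[1, 3]` for layers 1, 2 and 3 respectively.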
• the pre-training process of the autoencoders can be:
• (1) the federated learning server delivers the layer-r autoencoder model;
• (2) each client receives the layer-r autoencoder model, uses local data to train the model for several rounds, and then uploads the layer-r autoencoder model to the federated learning server; (3) repeat (1) to (2) until the layer-r autoencoder converges; (4) repeat (1) to (3) to train the autoencoder of each layer, layer by layer.
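Steps (1) to (3) of the layer-wise pre-training loop can be sketched as a FedAvg-style round, with clients modeled as opaque local-update callables. All names are illustrative and convergence is approximated by a fixed round count:

```python
def federated_average(models):
    # element-wise mean of the clients' uploaded parameter vectors
    n = len(models)
    return [sum(p) / n for p in zip(*models)]

def pretrain_layer(server_params, clients, rounds):
    """Sketch of steps (1)-(3): the server delivers the current layer's
    autoencoder parameters, each client trains locally (an opaque
    callable here), and the server averages the uploaded results."""
    params = server_params
    for _ in range(rounds):
        local_models = [client(params) for client in clients]
        params = federated_average(local_models)
    return params

# toy "clients": each local update shifts parameters by a client-specific amount
clients = [lambda p: [x + 1.0 for x in p],
           lambda p: [x + 3.0 for x in p]]
```

Step (4) would simply wrap this loop once per autoencoder layer.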
  • Step 204 The first federated learning client sends the first local feature to the server.
  • the first local feature is extracted from the first network data.
  • the first federated learning client is any one of multiple clients.
  • the server receives local features from multiple clients, and the local features are extracted from the first network data.
  • the first federated learning client selects the first local feature from the features of the first network data, it can send it to the server.
• other clients can also extract local features from their local first network data accordingly and send the features to the server.
  • the second federated learning client selects the second local feature from the features of the local first network data and sends it to the server.
  • Step 205 The server obtains global prior features based on local features from multiple clients.
  • the federated learning server collects local features reported by multiple clients, and then processes these local features to determine global prior features based on the features of multiple clients.
• since the global prior features are obtained through the fusion of multi-client information, the accuracy of the global prior features can be improved.
• the federated learning server receives the local features uploaded by the multiple clients.
• the local features uploaded by the multiple clients are spliced, and the spliced features are input into the global prior feature extraction model to obtain, as output, the global prior features corresponding to each client.
  • the global prior feature extraction model uses DNN, CNN, Transformer or RNN model.
• when a recurrent model such as the RNN is used, the input of the model also includes the model's previous output value.
  • the global prior feature extraction model can also be implemented using a lookup table, for example, the spliced local features are used as keys and the corresponding global prior features are used as values.
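The lookup-table variant mentioned above can be sketched as follows. The class and method names, and the tuple-based key, are illustrative assumptions (a practical table would also need to discretize continuous features):

```python
def splice(local_features):
    # concatenate the clients' local features in a fixed client order
    return tuple(v for feats in local_features for v in feats)

class GlobalPriorTable:
    """Lookup-table form of the global prior feature extraction model:
    spliced local features are the key, global prior features the value."""

    def __init__(self):
        self.table = {}

    def put(self, local_features, prior):
        self.table[splice(local_features)] = prior

    def get(self, local_features, default=None):
        return self.table.get(splice(local_features), default)
```

The table trades the learned models (DNN, CNN, Transformer, RNN) for a direct key-value mapping, which only works when the spliced feature space is small or discretized.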
  • the first federated learning client can also send grouping information to the server.
• the grouping information indicates the group in which the local feature is located, so that the server can obtain the global prior features based on the local features from multiple clients and the groups in which those local features are located.
  • the first local feature uploaded by the first federated learning client includes the first sub-feature and the second sub-feature
  • the second local feature uploaded by the second federated learning client includes the third sub-feature and the fourth sub-feature
  • the grouping information from a federated learning client may indicate the grouping in which the first sub-feature is located and the grouping in which the second sub-feature is located
  • the grouping information from the second federated learning client indicates the grouping in which the third sub-feature is located and the grouping in which the fourth sub-feature is located.
• the federated learning server can first process the first sub-feature and the third sub-feature to obtain an intermediate feature, then process the intermediate feature together with the second sub-feature, and process the result together with the fourth sub-feature to obtain the global prior features.
  • Characteristics uploaded by different types of clients can be processed separately to reduce the impact between different types of clients.
  • the grouping information is information about grouping clients.
• within a group, the clients have the same land property (farmland, road, town), and clients in different groups have different land properties. Different land properties have different impacts on signal transmission, so grouping is performed to reduce the influence between clients on different land types. For example, on a highway, the data of vehicles (1, 2, 3) traveling in the same direction or in the same lane are in one group, and the data of vehicles (4, 5, 6) traveling in a different direction are in another group; the grouping information uploaded by vehicles (1, 2, 3) then indicates one group, and the grouping information uploaded by vehicles (4, 5, 6) indicates another group.
  • the federated learning server receives local features uploaded by multiple clients and corresponding group information
• the grouping information can be obtained by clustering the extracted local features separately (for example, the features extracted in step 203).
• the input data can be fed into the model one by one; local features belonging to the same group can be given the same global position encoding and then fed into the model, so that the model can distinguish the groups within the input sequence.
• the spliced local features are input into the first-layer model of the global prior feature extraction model to obtain the first-layer hidden features, one hidden feature per group (the hidden feature of the m-th group in the first layer).
• the first-layer hidden features are grouped and input into the second-layer model of the global prior feature extraction model to obtain the second-layer hidden features of each group.
• the grouping of the second layer can follow the grouping of the first layer, or can be a regrouping of all local features.
• the hidden features output by the second-layer model of the global prior feature extraction model are spliced with the local features, and the splicing result is input into the third-layer model of the global prior feature extraction model to obtain, as output, the global prior features corresponding to each client.
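The grouped, layered fusion idea can be sketched as a two-stage aggregation. The element-wise mean stands in for the learned per-layer models (which the embodiment leaves as DNN/CNN/Transformer choices), and the function name is illustrative:

```python
def grouped_prior(features_by_group):
    """Two-stage fusion: fuse the local features within each group first,
    then fuse the per-group results into one global prior feature."""
    group_feats = []
    for feats in features_by_group.values():
        # intra-group fusion (stand-in for the first-layer model)
        n = len(feats)
        group_feats.append([sum(v) / n for v in zip(*feats)])
    # inter-group fusion (stand-in for the later layers)
    m = len(group_feats)
    return [sum(v) / m for v in zip(*group_feats)]
```

Because clients in the same group are fused before any cross-group mixing, features from differently-situated clients (different lanes, different land properties) influence each other only through the group-level summaries.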
  • Step 206 The federated learning server sends global prior features to multiple clients respectively.
  • the first federated learning client receives the global prior features.
  • the federated learning server can deliver the global a priori features to multiple connected clients.
  • Step 207 The first federated learning client performs inference based on the global prior features and the second network data to obtain the inference result.
  • the first federated learning client can perform inference on the local second network data based on the global prior features, and obtain the inference result of the second network data.
  • the second time slot is the same as the first time slot or after the first time slot, that is, the second network data can be at the same time as the first network data, or it can be data after the time of the first network data.
  • the first network data is data at 9:00
  • the second network data can be data at any time between 9:00 and 9:15, which is not limited in the embodiment of the present application.
• inference can be performed directly on the second network data, on implicit features extracted from the second network data, or on one or more features selected from those implicit features; this is not limited in the embodiments of the present application.
  • the global prior feature is a feature vector or an inference model, and the inference model is used to output the inference result of the second network data according to the second network data.
  • the first federated learning client performs inference based on the global prior features, the second network data, and the local second machine learning model to obtain the inference results.
• The second machine learning model is used to perform inference on the second network data.
  • the first federated learning client needs to input the global prior features and the second network data into the local second machine learning model for inference, and use the output result of the second machine learning model as the inference result.
• Before inputting the second network data into the second machine learning model, the first federated learning client can also input the second network data into a third machine learning model (such as a local feature extraction model) to obtain multiple features that represent the characteristics of the second network data, which can save computing resources.
  • the first federated learning client inputs the global prior features and the second network data into the local second machine learning model for inference.
• The first federated learning client inputs the global prior features into the second machine learning model, obtains the weights corresponding to the multiple features of the second network data output by the second machine learning model (that is, each feature corresponds to a weight), and then determines the inference result based on the multiple features of the second network data and their respective weights.
• Figure 3 is a schematic diagram of an inference structure provided by an embodiment of the present application. As shown in Figure 3, the client calculates the weight values corresponding to different implicit features based on the global prior features. Specifically, the local feature extraction model uses a multi-layer autoencoder.
  • Client C n receives the global prior features issued by the federated learning server.
• The inference model (second machine learning model) uses a DNN model. Client C_n inputs the global prior features into the inference model, and the inference model outputs the weight values corresponding to multiple hidden features and saves those weight values.
• Client C_n obtains the local data to be inferred at time t and extracts multiple hidden features through the local feature extraction model. Based on the weight values output by the inference model and the local anchor data set, the distance r(t)_{n,m} from the data to each anchor point is calculated.
• The evaluation index of output stability is: the maximum number of hidden feature vectors belonging to the same category divided by the total number of hidden feature vectors.
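The weighted-feature classification and the stability index described above can be sketched as follows. The use of Euclidean distance, the specific feature values, and all function names are illustrative assumptions of this sketch, not prescribed by the embodiment.

```python
import numpy as np

def anchor_distances(feature, anchors):
    """Euclidean distance from a weighted feature vector to each anchor point."""
    return np.linalg.norm(anchors - feature, axis=1)

def stability_index(categories):
    """Output-stability index: max count of one category / total count."""
    _, counts = np.unique(categories, return_counts=True)
    return counts.max() / len(categories)

# Hypothetical hidden features and the per-feature weights output by the
# inference model (second machine learning model).
hidden = np.array([0.2, 0.9, 0.1, 0.4])
weights = np.array([0.1, 0.6, 0.1, 0.2])
weighted = hidden * weights

# Local anchor data set: one anchor per class.
anchors = np.array([[0.0, 0.5, 0.0, 0.1],
                    [1.0, 1.0, 1.0, 1.0]])
r = anchor_distances(weighted, anchors)
predicted = int(np.argmin(r))  # classify to the nearest anchor

# Stability over several consecutive hidden feature vectors' classes.
s = stability_index([0, 0, 1, 0, 0])
```

Here the stability index is 4/5 = 0.8, since four of the five classified vectors fall into the same category.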
  • the second machine learning model described above is a task model.
• In some embodiments, the task model does not need to process the second network data as a whole. Instead, the second machine learning model includes multiple first task models, which can process multiple features of the second network data respectively. The following describes how the second network data is processed in this case.
• After the first federated learning client obtains the global prior features, it can determine the respective weights of the multiple first task models from the global prior features.
• The first federated learning client can input the features of the second network data into the multiple first task models. The input features of the multiple first task models can be features of preset types; that is, the features of the second network data are classified by category and input into the first task model of the corresponding category.
• The first federated learning client can then perform a weighted average of the inference features output by the multiple first task models, using the corresponding weights of the multiple first task models, to obtain the inference result.
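The weighted average of the task-model outputs can be sketched as follows. The normalization step, the example outputs, and the function name are assumptions of this sketch.

```python
import numpy as np

def weighted_inference(task_outputs, weights):
    """Weighted average of the inference features output by the first task models."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize the weights to sum to 1
    return np.tensordot(weights, np.asarray(task_outputs), axes=1)

# Hypothetical outputs of three first task models for the same input, and the
# weights determined from the global prior features.
outs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
w = [0.5, 0.25, 0.25]
result = weighted_inference(outs, w)
```

With these numbers the combined inference feature is 0.5·[1,0] + 0.25·[0,1] + 0.25·[1,1] = [0.75, 0.5].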
  • the first federated learning client determines the respective weights of multiple first task models based on global prior features.
• The first federated learning client needs to combine the global prior features and the second network data to calculate the respective weights of the multiple first task models; that is, the weights are calculated based on the characteristics of the data to be inferred, to improve the accuracy of the weights.
• Before inputting the second network data into the second machine learning model, the first federated learning client can also input the second network data into a third machine learning model (for example, a local feature extraction model) to obtain multiple features that reflect the characteristics of the second network data, which can save computing resources.
  • Figure 4 is a schematic diagram of another inference structure provided by the embodiment of the present application. Please refer to Figure 4.
• Client C_n receives and saves the global prior features issued by the federated learning server. Here, the inference model uses a hybrid expert (mixture-of-experts) model.
  • the hybrid expert model includes a model selector and multiple first task models, such as task model 1 to task model N in the figure.
• Client C_n obtains the local data to be inferred at time t and inputs it into the hybrid expert model to calculate the inference result.
  • the hybrid expert model consists of a model selector, a task model group composed of multiple task models, and a classifier:
• The input of the model selector is the global prior features (the second network data can also be added), and the output is the weights of the multiple task models in the "task model group" (the selection can be a 0-1 weight);
  • the "task model group” is composed of N "task models”, and each "task model” based on the input data, the output is Implicit feature vector;
• The input of the classifier is the implicit feature vector obtained by weighted summation using the weights output by the model selector, and the output is the classification result.
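The three components above (model selector, task model group, classifier) can be sketched as a minimal forward pass. The softmax gating, tanh task models, and all dimensions are assumptions of this sketch; the embodiment only requires that the selector output per-task-model weights and that the classifier consume the weighted sum.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class HybridExpertModel:
    """Minimal mixture-of-experts sketch: selector -> task models -> classifier."""
    def __init__(self, selector_W, task_Ws, classifier_W):
        self.selector_W = selector_W      # parameters of the model selector
        self.task_Ws = task_Ws            # one weight matrix per task model
        self.classifier_W = classifier_W  # parameters of the classifier

    def forward(self, global_prior, x):
        # Model selector: global prior features -> one weight per task model.
        gate = softmax(global_prior @ self.selector_W)
        # Task model group: each task model outputs an implicit feature vector.
        hidden = [np.tanh(x @ W) for W in self.task_Ws]
        # Weighted sum of the implicit feature vectors.
        mixed = sum(g * h for g, h in zip(gate, hidden))
        # Classifier on the weighted-sum feature.
        return softmax(mixed @ self.classifier_W)

rng = np.random.default_rng(1)
model = HybridExpertModel(rng.normal(size=(5, 3)),
                          [rng.normal(size=(4, 6)) for _ in range(3)],
                          rng.normal(size=(6, 2)))
probs = model.forward(rng.normal(size=5), rng.normal(size=4))
```

With a 0-1 gate, unselected task models contribute nothing to the weighted sum and can be skipped entirely, which matches the note in Step 3 below that unselected implicit features need not participate in the calculation.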
• Step 1 The global prior features (the second network data can also be added) are used to calculate the weight values of the multiple task models through the "model selector". The global prior features are first input into the model selector model (which can be implemented using a DNN, CNN, or Transformer), which has its own model parameters. The input-output relationship of the model selector model can be expressed in either of two forms.
  • the distance can use Euclidean distance, cosine distance or custom distance.
  • the anchor point set of the model selector model is:
• The optimal distance is related to the specific method of calculating the distance: if Euclidean distance is used, the optimal distance is the reciprocal of the minimum Euclidean distance; if cosine distance is used, the optimal distance is the maximum cosine distance.
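The two optimal-distance conventions above can be sketched as follows, assuming an anchor set held by the model selector. The anchor values, input vector, and function names are illustrative assumptions.

```python
import numpy as np

def euclidean_score(x, anchors):
    """Optimal distance under Euclidean distance: reciprocal of the minimum."""
    d = np.linalg.norm(anchors - x, axis=1)
    return 1.0 / d.min()

def cosine_score(x, anchors):
    """Optimal distance under cosine distance: the maximum cosine similarity."""
    sims = anchors @ x / (np.linalg.norm(anchors, axis=1) * np.linalg.norm(x))
    return sims.max()

# Hypothetical anchor point set of the model selector and an input feature.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
x = np.array([0.9, 0.1])
```

Both scores grow as the input gets closer to its nearest anchor, so either convention can be used to rank candidate task models consistently.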
• Step 2 The task models selected by the "model selector" calculate implicit feature values from the local original features. The m-th task network has its own parameters, and its input-output relationship is expressed accordingly.
  • Step 3 The "weighted summator” performs a weighted sum of the M weights output by the “model selector” on the M implicit feature vectors of the "task model group” (the implicit features of the unselected task models do not need to participate) calculate),
• Step 4 The classifier is implemented using a DNN, with its own model parameters and input-output relationship.
• The global prior features issued by the federated learning server can also directly be the model selector in the hybrid expert model.
  • the local feature extraction model can also be the hybrid expert model.
  • the local feature extraction model (the third machine learning model) may include multiple second task models, and the first federated learning client may extract features of the second network data through the third machine learning model.
• The first federated learning client may first determine the weights corresponding to the multiple second task models based on local data or the second network data, and then input the second network data into each second task model to obtain the sub-features of the second network data output by that second task model. The first federated learning client can then perform a weighted average of the sub-features output by the second task models, using the weights corresponding to the multiple second task models, to obtain the features of the second network data.
  • Figure 5 is a schematic diagram of another reasoning structure provided by an embodiment of the present application. Please refer to Figure 5.
  • the federated learning server generates task synchronization information and delivers the task synchronization information to multiple clients.
  • the client C n receives the task synchronization information issued by the federated learning server, and finds the data X n that satisfies the task synchronization information from the local data. Among them, meeting the task synchronization information means that the collection time of data X n is closest to the time of the task synchronization information.
• Client C_n obtains implicit features through its local feature extraction model as the local features of client C_n. The local feature extraction model is implemented as a hybrid expert model, which consists of multiple task models and a model selector; the data is routed through the model selector to a task model for feature extraction, yielding the local features.
• Each task model i is trained using decoupled representation learning; that is, the local features output by the task model can be divided into K groups, where the features within a group are similar and the features between groups are specific. Features obtained by different neural network structures are of different types, so a feature can be a scalar, a vector, a matrix, or a tensor, as follows:
• If the local feature extractor uses a fully connected neural network or an RNN, the output feature can be a scalar, where the model parameters W1 and W2 are matrices, or a vector, where the model parameter W1 is a matrix.
• When expressions for tensor calculations are introduced, W2 is a three-dimensional tensor and the computation is a tensor calculation; W2 can also be a higher-order tensor, in which case the output feature can be a matrix or a tensor.
• If the local feature extractor uses a convolutional neural network or a Transformer neural network, the feature is a matrix. Taking an image as an example, after the image passes through the convolution layers of the CNN, the feature represents the feature map of the m-th channel corresponding to convolution kernel m, which is a matrix; if the data within a channel are averaged, the feature is a scalar.
• The features within a group are similar, and the features between groups are specific. Similarity and specificity can be measured by the distance between features: if m1 and m2 belong to the same feature group, their distance value is small; if they belong to different feature groups, their distance value is large.
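The within-group/between-group distance criterion above can be sketched numerically. The specific feature vectors and the use of Euclidean distance are assumptions of this sketch.

```python
import numpy as np

def pairwise_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

# Hypothetical local features divided into K = 2 groups: features within a
# group are similar, features across groups are specific (far apart).
group1 = [np.array([1.0, 1.0]), np.array([1.1, 0.9])]
group2 = [np.array([5.0, -3.0]), np.array([5.2, -2.9])]

within = pairwise_distance(group1[0], group1[1])    # small
between = pairwise_distance(group1[0], group2[0])   # large
```

A small within-group distance and a large between-group distance is exactly the property decoupled representation learning aims to enforce.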
• Client C_n uploads its local features to the federated learning server.
  • the federated learning server receives local features uploaded by multiple clients.
• The federated learning server splices the local features uploaded by multiple clients into the input of the global prior feature extraction model, inputs them into the global prior feature extraction model, and outputs the global prior features. Here, the global prior features are the task selectors of each client's inference model, and the global prior features corresponding to each client (the task selector of its inference model) are sent to the corresponding client C_n.
• Client C_n receives the global prior features (the task selector of the inference model) issued by the federated learning server, obtains the local data to be inferred at time t, X(t)_n, through the collector, extracts the local feature h(t)_n from the data to be inferred X(t)_n, and inputs the local feature h(t)_n into the inference model to obtain the inference result.
  • the inference model uses a hybrid expert model, which consists of a model selector and a task model.
• The model selector is the global prior feature (the task selector of the inference model).
• Each task model corresponds to one feature group of the local features h(t)_n, and the features within each feature group are similar. That is, the input of task model 1 is the first group of features of h(t)_n, the input of task model 2 is the second group of features of h(t)_n, and the input of task model N is the N-th group of features of h(t)_n.
• The global prior features can also directly be inference models, such as a classifier or regressor with its model parameters.
  • z represents the input of the classifier or regressor
  • Re represents the classifier or regressor model parameters.
• The classifiers or regressors of each client can be the same or different.
  • Figure 6 is a schematic diagram of a calculation graph provided by an embodiment of the present application.
• The global prior feature can be a calculation graph with a set of feature values and calculation logic relationships, where an element represents the k-th feature vector of the n-th client.
  • the feature vector can be used as part of the model parameters and learned through data during the training process, or it can be a manually set value;
  • C n represents the calculation flow graph of the n-th client.
  • the calculation graphs of each client can be the same or different.
  • Computational graphs can be obtained through data training or through knowledge graphs.
• The left picture shows the pre-calculated distance between the input feature vector v and the feature values in the calculation graph, that is, the cosine distance between v and each value in the set. The picture on the right compares the calculated cosine distance with a threshold: if it is greater than the threshold, the result is 1; if it is less than the threshold, the result is 0.
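The cosine-distance-then-threshold logic of the calculation graph can be sketched as follows. The stored feature values, input vector, and threshold are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(v, u):
    """Cosine similarity between two vectors."""
    return float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))

def threshold_activations(v, feature_values, threshold):
    """Cosine similarity of v with each stored feature value, then binarize:
    1 if greater than the threshold, 0 otherwise."""
    sims = [cosine_similarity(v, f) for f in feature_values]
    return [1 if s > threshold else 0 for s in sims]

# Hypothetical feature values stored in the calculation graph.
features = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
v = np.array([0.9, 0.1])
bits = threshold_activations(v, features, threshold=0.5)
```

The resulting 0/1 vector can then drive the downstream calculation logic of the graph, with each bit indicating whether the input activates the corresponding stored feature.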
  • the global prior feature can be a knowledge graph, taking image recognition as an example.
  • Figure 7 is a schematic diagram of a knowledge graph provided by an embodiment of the present application.
• When participating in reasoning, the local feature extractor calculates not only the value of a feature but also the position of the feature. Activation indicates that the sum of values of a channel of the local feature extractor, of a combination of channels, or of a pixel value within a channel is greater than the threshold.
• Client C_n receives the global prior features issued by the federated learning server, that is, the calculation graph, classifier/regressor, or knowledge graph, and saves the calculation graph, classifier/regressor, or knowledge graph locally.
• Client C_n obtains the local data to be inferred at time t, X(t)_n, through the collector, extracts multiple hidden features through the local feature extraction model, and then inputs them into the calculation graph, classifier/regressor, or knowledge graph to obtain the output results.
• The first federated learning client uses sample data to train the first machine learning model, extracts the features of the second network data, and then inputs the features of the second network data into the trained first machine learning model to obtain the inference result output by the trained first machine learning model.
• Client C_n receives the global prior features issued by the federated learning server (the calculation graph, classifier/regressor, or knowledge graph), initializes the parameters of the local feature extraction model used for local inference, uses the calculation graph, classifier/regressor, or knowledge graph as the classifier or regressor to form an inference model, and then uses local data to train the inference model.
  • the local data here are historical data and corresponding historical reasoning results.
• Client C_n obtains the local data to be inferred at time t, X(t)_n, through the collector, and passes the data to be inferred X(t)_n through the inference model to obtain the output.
  • the federated learning server can also determine the global prior features by combining the historical local features from the first federated learning client and the historical local features from the second federated learning client.
  • the federated learning server can calculate the similarity between the local features of the current reasoning process and multiple groups of historical local features, that is, determine the similarity between the first local feature and the historical local features from the first federated learning client in each group of historical local features.
  • multiple groups of historical local features may also have labels.
  • the labels are the actual results of each group of historical local features manually labeled, and can be used to compare with the inference results to determine the accuracy of the inference results.
• After obtaining the inference results, the clients can upload them to the federated learning server. The federated learning server can determine the target inference result corresponding to the current global prior features based on the inference results of multiple clients, and can select from the multiple groups of historical local features the target group of historical local features labeled with the target inference result. The similarity between the local features of the current reasoning process and the historical local features of the target group can then be used to update the historical local features of the target group. If the similarity between the local features of the current reasoning process and the historical local features of the target group is less than the threshold, the local features of the current reasoning process can be saved locally as a new group of historical local features.
  • Figure 8 is a schematic diagram of another reasoning structure provided by the embodiment of the present application. Please refer to Figure 8.
• The federated learning server receives the local features uploaded by multiple clients, calculates the similarity between the local features and the sample library through the global prior model, and performs a weighted sum of the corresponding hg(m) based on the similarity to obtain the global prior features.
• The sample library of the federated learning server stores, for each sample, the historical local feature set corresponding to the m-th sample data, while hg(m) represents the historical global prior feature corresponding to the m-th sample data.
  • the weighted sum is:
  • the federated learning server issues global prior features hg to each client.
• Each client obtains the data to be inferred and combines it with the global prior features hg to obtain local inference results, then uploads the inference results to the federated learning server. If the client has a post-observation labeling function, it uploads the post-hoc labels to the federated learning server.
• The federated learning server receives the inference results or annotated labels uploaded by the clients and counts the class y corresponding to the largest number of results. It then finds all samples of the same category, that is, those whose y(m) is the same as y, and calculates the similarity with the samples of that class, using, for example, Euclidean distance, cosine distance, or a neural network. For sample data whose similarity is greater than the threshold, the corresponding hg(m) is updated according to the similarity. If none of the selected samples meets the threshold, the current features are saved as a new sample.
• hg(m) ← r(m)*hg + (1 − r(m))*hg(m).
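The update rule hg(m) ← r(m)*hg + (1 − r(m))*hg(m) above can be sketched directly; the example feature values and similarity r(m) = 0.25 are illustrative.

```python
def update_historical_prior(hg_m, hg, r_m):
    """Update rule hg(m) <- r(m)*hg + (1 - r(m))*hg(m), where r(m) is the
    similarity between the current features and sample m."""
    return [r_m * a + (1.0 - r_m) * b for a, b in zip(hg, hg_m)]

# Hypothetical historical global prior feature and current global prior feature.
hg_m = [0.0, 1.0]   # stored historical value for sample m
hg = [1.0, 0.0]     # current global prior feature
updated = update_historical_prior(hg_m, hg, r_m=0.25)
```

A larger similarity r(m) moves the stored value hg(m) further toward the current global prior feature hg, so frequently matched samples track recent conditions more closely.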
• The client can also train the global prior model, local feature extraction model, and inference model in Figure 3. Specifically, each client calculates the error between the inference results and the labels, uses error backpropagation to calculate the error of the global prior features, and uploads the error of the global prior features to the federated learning server.
• The federated learning server receives the errors of the global prior features uploaded by each client, uses error backpropagation to calculate the errors of the local features, and sends them to each client.
  • Each client and the federated learning server use their own errors to calculate the model gradient and update the model.
• The client updates the inference model based on the error between the inference results and the labels, and the federated learning server updates the global prior feature extraction model based on the error of the global prior features.
  • the federated learning server can also save some data of multiple clients, use this part of the data as local data of the federated learning server, and train the initial inference model based on the local data of the federated learning server.
  • the details are not limited here.
  • Figure 9 is a schematic diagram of a method for obtaining local data by a federated learning server provided by an embodiment of this application.
  • the federated learning server uses local data to train a "differential decision model” and delivers it to the client side.
  • the client side uses the "difference judgment model” issued by the federated learning server to calculate the difference indicators, samples the difference indicators, and uploads the sampled difference indicators to the federated learning server.
  • the federated learning server collects the difference indicators for clustering to obtain the global difference indicator class center, and then clusters the global difference indicators The department's difference indicator class center issues it to the client.
• The client side receives the difference indicator class centers issued by the cloud side, classifies the local data on the client side to the various class centers, and counts the classification results: the number of data items in each class center, the data far away from the class centers, and the number of important data items. It then uploads the statistical results to the federated learning server.
  • the local data on the client side includes ordinary data and important data.
  • the important data can be intrusion data, error data, voting inconsistent data, or data far away from the anchor point, which is not limited here.
  • the federated learning server receives the statistical results and combines them with local data to generate a collection strategy (which can be different for each client), and sends it to each client.
  • the client side receives the collection strategy issued by the federated learning server and collects data, samples the collected data, and uploads the sampled data to the federated learning server.
  • the federated learning server uses active learning to label the uploaded data and distributes it to each client.
  • the federated learning server uses local data to train the labeling model and delivers it to the client side.
  • the client side uses the labeling model to label local data, and then trains the local inference model based on the labeled local data and the labeled data delivered by the federated learning server.
  • the federated learning server can also compress the local data of the federated learning server, such as using a training data generator to compress the data. And the federated learning server can also train an inference model based on local data as the initial inference model of the client. The details will not be described here.
  • FIG. 10 is a schematic flowchart of a fixed parameter task selector generation provided by an embodiment of the present application. Please refer to Figure 10 for details:
  • Step 1001. The federated learning server generates K1 cluster center information according to categories.
  • Step 1002 The federated learning server delivers K1 class center information to the client.
• Step 1003. The client receives the K1 cluster center information, classifies the K categories of data according to the cluster center information, and counts the mean and number of the data belonging to each cluster center.
  • Step 1004. The client uploads the mean and number of data belonging to each cluster center to the federated learning server.
  • Step 1005. The federated learning server calculates the mean value of the data belonging to each global cluster center to obtain new cluster center information.
  • Step 1006 The federated learning server determines whether the new cluster center information has converged. If it has not converged, go to step 1002. If it has converged, go to step 1007.
  • the condition for convergence can be that the change of the new cluster center information compared with the previous cluster center information is less than the threshold, or the number of iterations has been reached.
  • Step 1007 The federated learning server issues new K1 cluster center information.
  • Step 1008 The client counts the number of K categories of data corresponding to the information in K1 cluster centers.
  • Step 1009 The client uploads data of K categories corresponding to the number of information in K1 cluster centers to the federated learning server.
  • Step 1010 The federated learning server receives the data of K classes uploaded by each client corresponding to the number of information in K1 cluster centers, and generates fixed parameter task selector rules.
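Steps 1001 to 1007 above amount to a federated k-means iteration: clients upload only per-center sums and counts, and the server averages them into new centers. The following is a minimal sketch under assumed one-dimensional data; the data values, center initialization, and fixed round count are illustrative, and convergence checking (step 1006) is replaced by a fixed number of rounds.

```python
import numpy as np

def client_stats(data, centers):
    """Client step: assign each point to its nearest cluster center and report,
    per center, the local sum and count (no raw data is uploaded)."""
    assign = np.argmin(np.linalg.norm(data[:, None] - centers[None], axis=2), axis=1)
    sums = np.zeros_like(centers)
    counts = np.zeros(len(centers))
    for x, a in zip(data, assign):
        sums[a] += x
        counts[a] += 1
    return sums, counts

def server_update(stats, centers):
    """Server step: average the uploaded sums/counts into new cluster centers."""
    total_sum = sum(s for s, _ in stats)
    total_cnt = sum(c for _, c in stats)
    new = centers.copy()
    for k in range(len(centers)):
        if total_cnt[k] > 0:
            new[k] = total_sum[k] / total_cnt[k]
    return new

# Two hypothetical clients with local 1-D data, and K1 = 2 initial centers.
clients = [np.array([[0.0], [0.2]]), np.array([[4.0], [4.2]])]
centers = np.array([[0.5], [3.5]])
for _ in range(5):  # fixed rounds in place of the convergence check
    centers = server_update([client_stats(d, centers) for d in clients], centers)
```

Because only aggregated statistics leave each client, the raw local data never reaches the server, consistent with the privacy goal of federated learning.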
  • Solution 1 For scenarios that require privacy protection:
• The numbers of the K categories of data uploaded by client n corresponding to the K1 cluster center information form the matrix A_n. The first step is to calculate A, the global numbers of the K categories of data corresponding to the K1 cluster center information:
• Step 2 For each category, select the cluster numbers corresponding to the topK1 cluster center information with the largest counts as the experts selected for that category. The topK1 values in b_k = [b_{k,1} ... b_{k,K1}] are non-zero, and the non-zero positions in b_k are the positions corresponding to the topK1 values in a_k.
  • the value of the non-zero position in b k can be 1, or it can be a weighted value based on the amount of data.
  • the weighted value is:
• Each expert can be used at most topK2 times. When the number of times an expert has been selected exceeds topK2, the expert is no longer considered in the topK1 selection. The order of expert selection can be sorted by the amount of data from which the experts are selected, starting with the expert with the largest amount of data. For example, if a_{k,k1} has the largest value, experts are selected starting from the k-th category.
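Solution 1 above can be sketched as follows: sum the per-client count matrices into a global matrix A, then, processing categories in descending order of their largest count, assign each category its topK1 experts while capping each expert at topK2 uses. The example matrices, 0-1 weights (rather than data-weighted values), and function name are assumptions of this sketch.

```python
import numpy as np

def select_experts(A_list, topK1, topK2):
    """Solution 1 sketch: pick, per category, the topK1 cluster centers
    (experts) with the largest global counts, capping each expert at topK2
    selections overall."""
    A = sum(A_list)                          # global K x K1 count matrix
    usage = np.zeros(A.shape[1], dtype=int)  # times each expert was selected
    B = np.zeros_like(A)
    order = np.argsort(-A.max(axis=1))       # category with largest count first
    for k in order:
        ranked = np.argsort(-A[k])           # experts by descending count
        picked = [e for e in ranked if usage[e] < topK2][:topK1]
        for e in picked:
            usage[e] += 1
            B[k, e] = 1                      # 0-1 weight; could be data-weighted
    return B

# Hypothetical count matrices from two clients (K = 2 categories, K1 = 3 centers).
A1 = np.array([[10, 1, 0], [0, 5, 5]])
A2 = np.array([[5, 0, 0], [0, 3, 4]])
B = select_experts([A1, A2], topK1=1, topK2=1)
```

With topK2 = 1, an expert claimed by a higher-priority category becomes unavailable to later categories, which is the capping behavior described above.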
• Option 2 For the scenario where minimizing the number of experts selected by each user is the optimization target:
• Step 1 Normalize A_n over the data belonging to the different cluster center information within each category to obtain the probability that each category of data belongs to the different cluster center information. Step 2 Cluster these probabilities into K3 cluster centers, with rows belonging to the same cluster center placed in one group.
• Step 3 Same as Option 1, except that the number of experts in each group is limited to at most topK4. If it exceeds topK4, only the experts with the largest amount of data among those corresponding to topK4 are selected as the experts of the corresponding category. Step 4 When the matching relationship between categories and experts is inconsistent between groups, the matching that covers the most category-expert pairs is used. For example, assuming there are 10 groups, if expert 2 corresponds to the largest number of groups, expert 2 is selected.
  • Figure 11 is a schematic flow chart of task model selection provided by an embodiment of the present application. Please refer to Figure 11:
  • Step 1101. The federated learning server issues fixed parameter task selector rules.
  • Step 1102. The client receives the fixed parameter task selector rule and counts the task model ID number corresponding to the local data.
  • Step 1103. The client uploads the task model ID to the federated learning server.
  • Step 1104. The federated learning server matches each client task model ID and matches the corresponding task model and parameter trainable task model selector.
  • Figure 12 is a schematic flowchart of model pre-training provided by the embodiment of the present application. Please refer to Figure 12:
  • Step 1201. The federated learning server issues a task model and a task model selector with trainable parameters.
  • Step 1202. The client receives a task model selector with trainable parameters and counts the task model ID number corresponding to the local data.
  • Step 1203. The client uploads the task model ID to the federated learning server.
  • Step 1204. The federated learning server matches each client task model ID and matches the corresponding task model.
  • Step 1205. The federated learning server issues the task model.
  • Step 1206 The client uses local data, uses the task model of the federated learning server as the initial value of the client's task model, selects the task model through the fixed parameter task selector, and trains the client's task model for several rounds.
• Step 1207 The client uses local data, uses the parameter-trainable task model selector of the federated learning server as the initial value of the client's parameter-trainable task model selector, pseudo-labels the data through the fixed parameter task selector, and then trains the parameter-trainable task model selector for several rounds.
  • Step 1208. The client uploads the updated client's task model and parameter-trainable task model selector.
• Step 1209 The federated learning server receives the updated client task model and parameter-trainable task model selector from each client, and performs a weighted average to obtain the task model and parameter-trainable task model selector of the federated learning server.
  • Step 1210 Repeat steps 1201 to 1209 several times.
  • Figure 13 is a schematic flow chart of model training provided by an embodiment of the present application. Please refer to Figure 13:
  • Step 1301. The federated learning server issues the parameter-trainable task model selector of the federated learning server.
  • Step 1302. The client receives the parameter-trainable task model selector from the federated learning server and counts the task model ID numbers corresponding to the local data.
  • Step 1303. The client uploads the task model ID number to the federated learning server.
  • Step 1304. The federated learning server receives the task model ID of each client and matches the task model of the corresponding federated learning server.
  • Step 1305. The federated learning server issues the corresponding task model of the federated learning server.
  • Step 1306 The client uses local data, uses the task model of the federated learning server as the initial model, selects the task model through the parameter-trainable task model selector of the federated learning server, and trains the client's task model for N rounds.
  • Step 1307 The client uploads the updated client's task model to the federated learning server.
  • Step 1308. The federated learning server receives the updated client's task model from each client, and obtains the task model of the federated learning server through a weighted average.
  • Step 1309 The federated learning server delivers the updated task model of the federated learning server to the client.
  • Step 1310. The client uses local data, takes the task model of the federated learning server as the client's task model, takes the parameter-trainable task model selector of the federated learning server as the initial value of the client's parameter-trainable task model selector, selects a task model through the client's parameter-trainable task model selector, and trains the client's parameter-trainable task model selector for N rounds.
  • Step 1311. The client uploads the updated client's parameter-trainable task model selector.
  • Step 1312. The federated learning server receives the updated parameter-trainable task model selector from each client, and uses a weighted average to obtain the parameter-trainable task model selector of the federated learning server.
  • Step 1313 Repeat steps 1301 to 1312 several times.
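The statistics gathered in steps 1302–1303 — for each local sample, which task model the selector chooses — might be computed as in the sketch below. The argmax-based selector and the toy scoring function are assumptions for illustration:

```python
import numpy as np
from collections import Counter

def count_task_model_ids(selector, local_data):
    """Run the selector on every local sample and tally how often each
    task model ID is chosen; the tally is what the client uploads."""
    chosen = [int(np.argmax(selector(x))) for x in local_data]
    return Counter(chosen)

# Toy selector: scores 3 task models by how close the sample mean is to 0, 1, 2.
def selector(x):
    m = float(np.mean(x))
    return np.array([-abs(m - 0.0), -abs(m - 1.0), -abs(m - 2.0)])

local_data = [np.array([0.1]), np.array([1.2]), np.array([0.9])]
id_counts = count_task_model_ids(selector, local_data)
# the server would then deliver the task models for the IDs that appear
```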
  • Figure 14 is a schematic flowchart of a model training according to the strategy provided by the embodiment of the present application. Please refer to Figure 14:
  • Step 1401. The federated learning server issues the parameter-trainable task model selector of the federated learning server.
  • Step 1402. The client receives the parameter-trainable task model selector from the federated learning server, and counts, for the local data, the number of samples of each data type processed by each task model.
  • Step 1403. The client uploads the statistical results to the federated learning server.
  • Step 1404. The federated learning server receives, from each client, the number of samples of each data type processed by each task model, and counts the number of samples of global data processed by each task model.
  • Step 1405. The federated learning server generates the corresponding relationship between the task model and various types of data.
  • Step 1406 The federated learning server issues the correspondence between the task model and various types of data.
  • Step 1407. The client uses local data, takes the task model of the federated learning server as the initial value of the client's task model, selects the client's task model as the student through the correspondence between the task models and the data types, uses the task model of the federated learning server as the teacher model, and trains the client's task model for N rounds through knowledge distillation.
  • Step 1408. The client uses local data, takes the parameter-trainable task model selector of the federated learning server as the initial value of the client's parameter-trainable task model selector, pseudo-labels the data through the correspondence between the task models and the data types, and trains the parameter-trainable task model selector for N rounds.
  • Step 1409 The client uploads the updated client’s task model and parameter-trainable task model selector.
  • Step 1410. The federated learning server receives the updated task model and parameter-trainable task model selector from each client, and uses a weighted average to obtain the task model and parameter-trainable task model selector of the federated learning server.
  • Step 1411. Repeat steps 1401 to 1410 several times.
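In step 1407 the server's task model acts as the teacher and the client's task model as the student. Below is a minimal sketch of a distillation loss on a single example, using temperature-softened cross-entropy; the temperature value and all names are assumptions, since the text does not fix a particular loss:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions;
    minimized when the student reproduces the teacher's outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(-np.sum(p_teacher * np.log(p_student + 1e-12)))

teacher = [2.0, 0.5, -1.0]
loss_matched = distillation_loss([2.0, 0.5, -1.0], teacher)
loss_mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher)
# a student that matches the teacher incurs a lower loss
```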
  • The first federated learning client collects the first network data related to the target network resource in the first time slot, extracts the first local feature of the first network data, and sends it to the federated learning server.
  • The federated learning server collects the local features uploaded by multiple clients, computes the global prior feature, and delivers it to each client.
  • The first federated learning client can then perform inference on the second network data of the second time slot based on the global prior feature.
  • Information from the data of clients other than the local data is thus used in the inference process, improving the accuracy of the inference result.
  • the inference method is described above, and the device for executing the method is described below.
  • Figure 15 is a schematic structural diagram of an inference device provided by an embodiment of the present application. Please refer to Figure 15.
  • the device 150 includes:
  • the transceiver unit 1501 is used to send the first local feature to the federated learning server.
  • the first local feature is extracted from the first network data.
  • the first network data is data related to the target network resource obtained by the first federated learning client in the first time slot;
  • the target network resource is a network resource managed by the first federated learning client. The transceiver unit 1501 is further configured to receive the global prior feature from the federated learning server, where the global prior feature is obtained based on the first local feature and a second local feature, and the second local feature is provided by the second federated learning client;
  • the processing unit 1502 is configured to perform inference based on global a priori features and second network data to obtain inference results.
  • the second network data is data related to the target network resource obtained by the first federated learning client in the second time slot,
  • the inference results are used to manage target network resources, where the second time slot is the same as or after the first time slot.
  • the transceiver unit 1501 is used to perform step 204 and step 206 in the method embodiment of Figure 2, and the processing unit 1502 is used to perform step 207 in the method embodiment of Figure 2.
  • the first network data is the sampling value of the data related to the target network resource in the first time slot, or the statistical value from the third time slot to the first time slot;
  • the second network data is the sampling value of the data related to the target network resource in the second time slot, and the third time slot precedes the first time slot.
  • the global prior feature is a feature vector or a first machine learning model
  • the first machine learning model is used for inference of the second network data.
  • the processing unit 1502 is specifically used to:
  • Inference is performed according to the global prior features, the second network data and the local second machine learning model to obtain the inference result, and the second machine learning model is used for inference of the second network data.
  • the processing unit 1502 is specifically configured to: input the second network data into the third learning model to obtain multiple features of the second network data output by the third learning model; input the global prior feature into the second machine learning model to obtain the respective weights of the multiple features of the second network data; and determine the inference result based on the multiple features of the second network data and their respective weights.
  • the second machine learning model includes multiple first task models; the processing unit 1502 is specifically configured to: calculate respective weights of the multiple first task models based on the global prior feature; input the features of the second network data into the multiple first task models to obtain inference features output by the multiple first task models; and obtain the inference result based on the respective weights of the multiple first task models and the inference features output by the multiple first task models.
  • the processing unit 1502 is specifically configured to calculate respective weights of multiple first task models based on global a priori features and second network data.
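The weighting described above — weights for the multiple first task models computed from the global prior feature, followed by a weighted combination of the task models' inference features — can be sketched as follows. The softmax gate, the linear `gate` matrix, and the toy task models are assumptions for illustration:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def weighted_inference(task_models, gate, global_prior, features):
    """Compute one weight per first task model from the global prior feature,
    then combine the models' inference features by weighted sum."""
    weights = softmax(gate @ global_prior)          # shape: (num_task_models,)
    outputs = np.stack([m(features) for m in task_models])
    return weights @ outputs                        # weighted combination

rng = np.random.default_rng(0)
gate = rng.normal(size=(2, 4))                      # 2 task models, 4-dim prior
global_prior = rng.normal(size=4)
task_models = [lambda f: f * 2.0, lambda f: f + 1.0]
result = weighted_inference(task_models, gate, global_prior,
                            np.array([1.0, 2.0]))
```

To also condition the weights on the second network data, as in the last variant above, the gate input would be the concatenation of the global prior feature and the data's features.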
  • the processing unit is also used to extract features of the second network data through a third machine learning model.
  • the third machine learning model includes a plurality of second task models; the processing unit 1502 is specifically configured to: determine respective weights of the plurality of second task models according to the second network data; input the second network data into the plurality of second task models to obtain sub-features of the second network data output by the plurality of second task models; and obtain the features of the second network data according to the respective weights of the plurality of second task models and the sub-features of the second network data output by the plurality of second task models.
  • each second task model is a one-layer autoencoder;
  • the reconstruction target of the r-th task model among the multiple second task models is the residual of the (r-1)-th task model, where r is an integer greater than 1 and denotes the number of second task models.
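The residual stacking described above can be sketched as follows, with each one-layer autoencoder approximated by a rank-k linear encoder fitted via SVD (an assumption for illustration; the class and function names are hypothetical):

```python
import numpy as np

class LinearAE:
    """Stand-in for a trained one-layer autoencoder: a rank-k linear
    encoder/decoder obtained from the SVD of the training matrix."""
    def __init__(self, k=1):
        self.k = k
    def fit(self, X):
        _, _, vt = np.linalg.svd(X, full_matrices=False)
        self.W = vt[: self.k]                 # encoder rows; decoder is W.T
        return self
    def encode(self, X):
        return X @ self.W.T
    def reconstruct(self, X):
        return self.encode(X) @ self.W

def fit_residual_stack(X, num_models, k=1):
    """Task model r (r > 1) is trained to reconstruct the residual
    left over by task model r-1."""
    models, target = [], X
    for _ in range(num_models):
        ae = LinearAE(k).fit(target)
        target = target - ae.reconstruct(target)   # residual -> next target
        models.append(ae)
    return models

def extract_features(models, X):
    """The per-model codes are the sub-features; concatenated they form
    the features of the data."""
    subfeatures, target = [], X
    for ae in models:
        subfeatures.append(ae.encode(target))
        target = target - ae.reconstruct(target)
    return np.hstack(subfeatures)

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 6))
stack = fit_residual_stack(X, num_models=3)
features = extract_features(stack, X)              # shape (32, 3)
```

Each stage strictly reduces the reconstruction residual of the previous one, which is the point of making the r-th reconstruction target the residual of the (r-1)-th model.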
  • the processing unit 1502 is specifically configured to: extract the features of the second network data; and input the features of the second network data into the first machine learning model to obtain the inference result output by the first machine learning model.
  • the processing unit 1502 is specifically configured to: train the first machine learning model using sample data; extract the features of the second network data; and input the features of the second network data into the trained first machine learning model to obtain the inference result output by the trained first machine learning model.
  • the transceiver unit 1501 is also configured to send grouping information to the federated learning server, where the grouping information indicates the group in which the first local feature is located, so that the federated learning server obtains the global prior feature based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located.
  • the transceiver unit 1501 is also configured to: receive task synchronization information from the federated learning server, where the task synchronization information is used to indicate the first time slot; and select the first network data from the local data according to the task synchronization information.
  • Figure 16 is a schematic structural diagram of another inference device provided by an embodiment of the present application. Please refer to Figure 16.
  • the device 160 includes:
  • Transceiver unit 1601 configured to receive the first local feature from the first federated learning client.
  • the first local feature is extracted from the first network data.
  • the first network data is data related to the target network resource obtained by the first federated learning client in the first time slot, and the target network resource is the network resource managed by the first federated learning client;
  • the processing unit 1602 is configured to obtain global prior features based on the first local features and the second local features, where the second local features are provided by the second federated learning client;
  • the transceiver unit 1601 is configured to send global prior features to the first federated learning client, so that the first federated learning client performs inference based on the global prior features and the second network data to obtain the inference results.
  • the second network data is data related to the target network resource obtained by the first federated learning client in the second time slot, and the inference result is used to manage the target network resource, wherein the second time slot is the same as or after the first time slot.
  • the transceiver unit 1601 is used to perform step 204 and step 206 in the method embodiment of Figure 2, and the processing unit 1602 is used to perform step 205 in the method embodiment of Figure 2.
  • the first network data is the sampling value of the data related to the target network resource in the first time slot, or the statistical value from the third time slot to the first time slot;
  • the second network data is the sampling value of the data related to the target network resource in the second time slot, and the third time slot precedes the first time slot.
  • the global prior feature is a feature vector or a first machine learning model
  • the first machine learning model is used for inference of the second network data.
  • the transceiving unit 1601 is also configured to: receive grouping information from the first federated learning client, where the grouping information from the first federated learning client indicates the grouping in which the first local feature is located;
  • the processing unit 1602 is specifically configured to: obtain global prior features based on the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located.
  • the group in which the second local feature is located is indicated by the grouping information from the second federated learning client.
  • the first local feature includes a first sub-feature and a second sub-feature
  • the second local feature includes a third sub-feature and a fourth sub-feature
  • the grouping information from the first federated learning client indicates the group in which the first sub-feature is located and the group in which the second sub-feature is located, the grouping information from the second federated learning client indicates the group in which the third sub-feature is located and the group in which the fourth sub-feature is located, and the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located;
  • the processing unit 1602 is specifically configured to: based on the group in which the first sub-feature is located being the same as the group in which the third sub-feature is located, process the first sub-feature and the third sub-feature to obtain an intermediate feature; and obtain the global prior feature based on the intermediate feature, the second sub-feature, the fourth sub-feature, the group in which the second sub-feature is located, and the group in which the fourth sub-feature is located.
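The group-wise handling above — sub-features in the same group are processed together while different groups are kept apart — can be illustrated with a simple per-group mean. The mean is an assumption, since this passage does not fix the processing operation, and all names are hypothetical:

```python
import numpy as np

def aggregate_by_group(subfeatures, groups):
    """Combine the uploaded sub-features that share a group label and
    keep different groups separate; returns one aggregate per group."""
    result = {}
    for g in sorted(set(groups)):
        members = [f for f, label in zip(subfeatures, groups) if label == g]
        result[g] = np.mean(members, axis=0)
    return result

# First and third sub-features share group "A" (e.g. one client type),
# second and fourth share group "B".
subfeatures = [np.array([1.0, 1.0]), np.array([0.0, 2.0]),
               np.array([3.0, 1.0]), np.array([4.0, 2.0])]
per_group = aggregate_by_group(subfeatures, ["A", "B", "A", "B"])
# per_group["A"] -> [2., 1.]   per_group["B"] -> [2., 2.]
```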
  • the processing unit 1602 is specifically configured to obtain global prior features based on the first local features, the second local features, the historical local features from the first federated learning client, and the historical local features from the second federated learning client.
  • the processing unit 1602 is specifically configured to: calculate the similarity between the local features of the current reasoning process and multiple groups of historical local features.
  • the local features of the current reasoning process include first local features and second local features.
  • each group of historical local features includes the historical local features from the first federated learning client and the historical local features from the second federated learning client in one historical inference process; based on the similarities between the local features of the current inference process and the multiple groups of historical local features, the historical prior features corresponding to the multiple groups of historical local features are weighted and summed to obtain the global prior feature.
  • the multiple groups of historical local features have labels, and a label is the manually annotated actual result of a group of historical local features; the processing unit 1602 is also configured to: receive the inference result from the first federated learning client; determine a target inference result according to the inference result of the first federated learning client and the inference result of the second federated learning client; and, when the similarity between the local features of the current inference process and a target group of historical local features is greater than or equal to a threshold, update the target group of historical local features based on the similarity between the local features of the current inference process and the target group of historical local features;
  • the target group of historical local features is one of the multiple groups of historical local features whose label is the target inference result; when the similarity between the local features of the current inference process and the target group of historical local features is less than the threshold, a group of historical local features is added to the multiple groups of historical local features, and the added group is the local features of the current inference process.
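The history mechanism above — weight the stored historical prior features by their similarity to the current local features, and add a new history group when nothing stored is similar enough — can be sketched as follows. Cosine similarity, softmax weighting, and all names are illustrative assumptions:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def global_prior(current, history, priors, threshold=0.9):
    """Similarity-weighted sum of stored historical priors; if no stored
    group is similar enough, the current features are added as a new group."""
    sims = np.array([cosine(current, h) for h in history])
    weights = np.exp(sims) / np.exp(sims).sum()        # softmax over similarity
    prior = sum(w * p for w, p in zip(weights, priors))
    if sims.max() < threshold:                         # nothing similar enough:
        history.append(np.asarray(current, dtype=float))  # store a new group
        priors.append(prior)                              # with its prior
    return prior

history = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
priors  = [np.array([10.0]),     np.array([20.0])]

p1 = global_prior(np.array([1.0, 0.1]), history, priors)   # close to group 0
p2 = global_prior(np.array([-1.0, -1.0]), history, priors) # unlike both: stored
```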
  • the transceiver unit 1601 is also configured to send task synchronization information to the first federated learning client;
  • the task synchronization information is used to indicate the first time slot, so that the first federated learning client selects the first network data from the local data according to the task synchronization information.
  • Figure 17 is a schematic diagram of a possible logical structure of the computer device 170 provided by the embodiment of the present application.
  • Computer device 170 includes: processor 1701, communication interface 1702, storage system 1703, and bus 1704.
  • the processor 1701, the communication interface 1702, and the storage system 1703 are connected to each other through a bus 1704.
  • the processor 1701 is used to control and manage the actions of the computer device 170.
  • the processor 1701 is used to execute the steps performed by the sending end in the method embodiment of Figure 2.
  • the communication interface 1702 is used to support the computer device 170 to communicate.
  • Storage system 1703 is used to store program codes and data of the computer device 170 .
  • the processor 1701 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure.
  • the processor 1701 may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and so on.
  • the bus 1704 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the transceiver unit 1501 in the device 150 is equivalent to the communication interface 1702 in the computer device 170
  • the processing unit 1502 in the device 150 is equivalent to the processor 1701 in the computer device 170 .
  • the computer device 170 of this embodiment may correspond to the first federated learning client in the above method embodiment of FIG. 2, and the communication interface 1702 in the computer device 170 may implement the functions and/or various steps performed by the first federated learning client in the above method embodiment of FIG. 2, which are not described again here.
  • Figure 18 is a schematic diagram of another possible logical structure of the computer device 180 provided by the embodiment of the present application.
  • Computer device 180 includes: processor 1801, communication interface 1802, storage system 1803, and bus 1804.
  • the processor 1801, the communication interface 1802, and the storage system 1803 are connected to each other through a bus 1804.
  • the processor 1801 is used to control and manage the actions of the computer device 180.
  • the processor 1801 is used to execute the steps performed by the receiving end in the method embodiment of Figure 2.
  • the communication interface 1802 is used to support the computer device 180 to communicate.
  • Storage system 1803 is used to store program codes and data of the computer device 180 .
  • the processor 1801 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 1801 may also be a combination that implements computing functions, such as a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor.
  • the bus 1804 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 18, but it does not mean that there is only one bus or one type of bus.
  • the transceiver unit 1601 in the device 160 is equivalent to the communication interface 1802 in the computer device 180
  • the processing unit 1602 in the device 160 is equivalent to the processor 1801 in the computer device 180 .
  • the computer device 180 of this embodiment may correspond to the receiving end in the method embodiment of FIG. 2, and the communication interface 1802 in the computer device 180 may implement the functions and/or various steps performed by the receiving end in the method embodiment of FIG. 2, which are not repeated here for brevity.
  • each unit in the device can be a separate processing element, or it can be integrated into a chip of the device;
  • a unit can also be stored in the memory in the form of a program, and a processing element of the device calls and executes the function of the unit.
  • all or part of these units can be integrated together or implemented independently.
  • the processing element described here can also be a processor, which can be an integrated circuit with signal processing capabilities.
  • each step of the above method or each of the above units can be implemented by an integrated logic circuit of hardware in the processor element, or implemented in the form of software called through the processing element.
  • the units in any of the above devices may be one or more integrated circuits configured to implement the above method, for example: one or more application-specific integrated circuits (ASICs), one or more microprocessors (digital signal processors, DSPs), one or more field-programmable gate arrays (FPGAs), or a combination of at least two of these integrated circuit forms.
  • the units in the device can be implemented in the form of a processing element scheduling program code;
  • the processing element can be a general processor, such as a central processing unit (Central Processing Unit, CPU) or other processors that can call programs.
  • these units can be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • a computer-readable storage medium is also provided;
  • computer-executable instructions are stored in the computer-readable storage medium;
  • when the processor of a device executes the computer-executable instructions, the device performs the method performed by the first federated learning client in the above method embodiment.
  • a computer-readable storage medium is also provided;
  • computer-executable instructions are stored in the computer-readable storage medium;
  • when the processor of a device executes the computer-executable instructions, the device performs the method performed by the federated learning server in the above method embodiment.
  • a computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium.
  • when the processor of a device executes the computer-executable instructions, the device performs the method performed by the first federated learning client in the above method embodiment.
  • a computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium.
  • when the processor of a device executes the computer-executable instructions, the device performs the method performed by the federated learning server in the above method embodiment.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

This application discloses an inference method and related apparatus. The method of this application is applied to an inference system including a server and multiple clients, and includes: a first federated learning client collects first network data related to a target network resource in a first time slot, extracts a first local feature of the first network data, and sends it to a federated learning server; the federated learning server collects the local features uploaded by the multiple clients, computes a global prior feature, and delivers it to each client; the first federated learning client can then perform inference on second network data of a second time slot based on the global prior feature. The inference process uses information from the data of clients other than the local data, so as to improve the accuracy of the inference result.

Description

Inference method and related apparatus
This application claims priority to the Chinese patent application No. 202210962642.4, entitled "一种推理方法及相关装置" ("Inference method and related apparatus") and filed with the China National Intellectual Property Administration on August 11, 2022, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of computer technology, and in particular, to an inference method and related apparatus.
Background
Federated learning is a distributed machine learning paradigm in which multiple parties (different organizations or users) use their own data to collaboratively train an artificial intelligence model without aggregating the parties' raw data. The traditional machine learning paradigm requires collecting large amounts of raw data for model training, and the raw data used for training is likely to come from multiple different organizations or users. Bringing together the raw data of multiple organizations or users creates a serious risk of data leakage: for an organization it exposes information assets, and for an individual user it may leak personal privacy. These problems pose severe challenges to the training of artificial intelligence models, and federated learning emerged to address them. Federated learning allows each party's raw data to remain local without multi-party data aggregation; the parties jointly train an artificial intelligence model by collaboratively computing and (securely) exchanging intermediate computation results. Federated learning thus both protects the data of all parties and makes full use of that data for collaborative training, yielding a stronger model.
By scenario, typical federated learning can be divided into three paradigms: horizontal federated learning, vertical federated learning, and federated transfer learning (federated domain adaptation), which address three typical scenarios respectively.
In a horizontal federated learning architecture, there is usually one server and the different clients participating in the horizontal federation. In the training phase, each client trains with its local data and uploads the trained model to the server; the server computes a weighted average of the models uploaded by all clients to obtain a global model. The global model is delivered to the clients for client-side inference.
In the inference phase, a client performs inference based on the global model delivered by the server and manages the resources of its device according to the inference result. However, because only local data is used for inference, the client's inference result is not accurate enough, which in turn disorders the device's resource management.
Summary
This application provides an inference method and related apparatus. During inference, the method uses information from the data of clients other than the local data, so as to improve the accuracy of the inference result.
A first aspect of this application provides an inference method. The method includes: a first federated learning client sends a first local feature to the federated learning server, where the first local feature is extracted from first network data, the first network data is data related to a target network resource obtained by the first federated learning client in a first time slot, and the target network resource is a network resource managed by the first federated learning client; the first federated learning client receives a global prior feature from the federated learning server, where the global prior feature is obtained based on the first local feature and a second local feature, and the second local feature is provided by a second federated learning client; and the first federated learning client performs inference based on the global prior feature and second network data to obtain an inference result, where the second network data is data related to the target network resource obtained by the first federated learning client in a second time slot, the inference result is used to manage the target network resource, and the second time slot is the same as or after the first time slot.
In the above aspect, the first federated learning client collects the first network data related to the target network resource in the first time slot, extracts the first local feature of the first network data, and sends it to the federated learning server; the federated learning server collects the local features uploaded by multiple clients, computes the global prior feature, and delivers it to each client; the first federated learning client can then perform inference on the second network data of the second time slot based on the global prior feature. The inference process uses information from the data of clients other than the local data, so as to improve the accuracy of the inference result.
In a possible implementation, the first network data is a sampling value, in the first time slot, of the data related to the target network resource, or a statistical value from a third time slot to the first time slot; the second network data is a sampling value, in the second time slot, of the data related to the target network resource; and the third time slot precedes the first time slot.
In a possible implementation, the global prior feature is a feature vector or a first machine learning model, and the first machine learning model is used for inference on the second network data.
In a possible implementation, when the global prior feature is a feature vector, the first federated learning client performing inference based on the global prior feature and the second network data to obtain the inference result includes: the first federated learning client performs inference based on the global prior feature, the second network data, and a local second machine learning model to obtain the inference result, where the second machine learning model is used for inference on the second network data.
In the above possible implementation, the first federated learning client inputs the global prior feature and the second network data into the local second machine learning model for inference and takes the output of the second machine learning model as the inference result. Using a trainable second machine learning model for inference can improve the accuracy of the solution.
In a possible implementation, the first federated learning client performing inference based on the global prior feature, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client inputs the second network data into a third learning model to obtain multiple features of the second network data output by the third learning model; the first federated learning client inputs the global prior feature into the second machine learning model to obtain respective weights of the multiple features of the second network data; and the first federated learning client determines the inference result based on the multiple features of the second network data and their respective weights.
In the above possible implementation, before inputting the second network data into the second machine learning model, the first federated learning client may also input the second network data into the third machine learning model to obtain multiple features reflecting the characteristics of the second network data, which can save computing resources. The first federated learning client may input the global prior feature into the second machine learning model to obtain the weight of each of the multiple features of the second network data output by the second machine learning model (one weight per feature), and can then determine the inference result based on the multiple features of the second network data and their respective weights. Different features correspond to different weights, which improves the accuracy of inference.
In a possible implementation, the second machine learning model includes multiple first task models; the first federated learning client performing inference based on the global prior feature, the second network data, and the local second machine learning model to obtain the inference result includes: the first federated learning client calculates respective weights of the multiple first task models based on the global prior feature; the first federated learning client inputs the features of the second network data into the multiple first task models to obtain inference features output by the multiple first task models; and the first federated learning client obtains the inference result based on the respective weights of the multiple first task models and the inference features they output.
In the above possible implementation, the second machine learning model includes multiple first task models that can separately process the multiple features of the second network data. Specifically, after obtaining the global prior feature, the first federated learning client can determine the respective weights of the multiple first task models from the global prior feature, and can input the features of the second network data into the multiple first task models. The input features of the multiple first task models may be of default types; that is, the features of the second network data are classified by category and input into the first task model of the corresponding category. The first federated learning client can then compute a weighted average of the inference features output by the multiple first task models with the weights corresponding to those models to obtain the inference result. Training different types of features in different first task models and then averaging with weights can improve the accuracy of inference.
In a possible implementation, the first federated learning client calculating the respective weights of the multiple first task models based on the global prior feature includes: the first federated learning client calculates the respective weights of the multiple first task models based on the global prior feature and the second network data.
In the above possible implementation, the first federated learning client combines the client's local data with the global prior feature to calculate the weights of the first task models; using the real-time second network data improves the accuracy of the weights.
In a possible implementation, the method further includes: the first federated learning client extracts the features of the second network data through a third machine learning model.
In the above possible implementation, using the features of the second network data for inference reduces the amount of inference computation.
In a possible implementation, the third machine learning model includes multiple second task models; the first federated learning client extracting the features of the second network data through the third machine learning model includes: the first federated learning client determines respective weights of the multiple second task models based on the second network data; the first federated learning client inputs the second network data into the multiple second task models to obtain sub-features of the second network data output by the multiple second task models; and the features of the second network data are obtained based on the respective weights of the multiple second task models and the sub-features of the second network data they output.
In the above possible implementation, the first federated learning client first determines the weights corresponding to the multiple second task models based on the local data or the second network data, and then inputs the second network data into the second task models to obtain the sub-features of the second network data output by the second task models. The first federated learning client can then compute a weighted average of the sub-features output by the second task models with the weights corresponding to those models to obtain the features of the second network data. Training different types of features in different task models and then averaging with weights can improve the accuracy of inference.
In a possible implementation, each second task model is a one-layer autoencoder, and the reconstruction target of the r-th task model among the multiple second task models is the residual of the (r-1)-th task model, where r is an integer greater than 1 and denotes the number of second task models.
In a possible implementation, when the global prior feature is the first machine learning model, the first federated learning client performing inference based on the global prior feature and the second network data to obtain the inference result includes: the first federated learning client extracts the features of the second network data; and the first federated learning client inputs the features of the second network data into the first machine learning model to obtain the inference result output by the first machine learning model.
In the above possible implementation, the federated learning server can directly deliver the first machine learning model as the inference model, which improves the flexibility of the solution.
In a possible implementation, when the global prior feature is the first machine learning model, the first federated learning client performing inference based on the global prior feature and the second network data to obtain the inference result includes: the first federated learning client trains the first machine learning model with sample data; the first federated learning client extracts the features of the second network data; and the first federated learning client inputs the features of the second network data into the trained first machine learning model to obtain the inference result output by the trained first machine learning model.
In the above possible implementation, when the federated learning server directly delivers the first machine learning model as the inference model, the client can further train it on local sample data, which improves the accuracy of the inference model.
一种可能的实施方式中,该方法还包括:第一联邦学习客户端向联邦学习服务端发送分组信息,分组信息指示第一局部特征所在的分组,以使得联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征。
上述可能的实施方式中,第一联邦学习客户端还可以向服务端发送分组信息,分组信息指示局部特征所在的分组,以使得服务端根据来自多个客户端的局部特征以及局部特征所在的分组得到全局先验特征,避免所有客户端的数据一起确定全局先验特征会影响组间的输出。
一种可能的实施方式中,该方法还包括:第一联邦学习客户端接收来自联邦学习服务端的任务同步信息,任务同步信息用于指示第一时隙;第一联邦学习客户端根据任务同步信息从本地数据中选择出第一网络数据。
上述可能的实施方式中，由联邦学习服务端指示客户端上传的局部特征所属的时隙，提高客户端上传的数据的同步性。
本申请第二方面提供了一种推理方法,该方法包括:联邦学习服务端接收来自第一联邦学习客户端的第一局部特征,第一局部特征是从第一网络数据中提取出的,第一网络数据为第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,目标网络资源为第一联邦学习客户端管理的网络资源;联邦学习服务端根据第一局部特征和第二局部特征,得到全局先验特征,第二局部特征是由第二联邦学习客户端提供;联邦学习服务端向第一联邦学习客户端发送全局先验特征,使得第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果,第二网络数据为第一联邦学习客户端在第二时隙获取的与目标网络资源有关的数据,推理结果用于管理目标网络资源,其中,第二时隙与第一时隙相同或在第一时隙之后。
一种可能的实施方式中,第一网络数据为目标网络资源有关的数据在第一时隙的采样值或者从第三时隙到第一时隙为止的统计值,第二网络数据为目标网络资源有关的数据在第二时隙的采样值,第三时隙在第一时隙之前。
一种可能的实施方式中,全局先验特征为特征向量或第一机器学习模型,第一机器学习模型用于第二网络数据的推理。
一种可能的实施方式中,该方法还包括:联邦学习服务端接收来自第一联邦学习客户端的分组信息,来自第一联邦学习客户端的分组信息指示第一局部特征所在的分组;联邦学习服务端根据第一局部特征 和第二局部特征,得到全局先验特征包括:联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征,第二局部特征所在的分组是由来自第二联邦学习客户端的分组信息指示的。
一种可能的实施方式中,第一局部特征包括第一子特征和第二子特征,第二局部特征包括第三子特征和第四子特征,来自第一联邦学习客户端的分组信息指示第一子特征所在的分组和第二子特征所在的分组,来自第二联邦学习客户端的分组信息指示第三子特征所在的分组和第四子特征所在的分组,且第一子特征所在的分组和第三子特征所在的分组相同;联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征包括:基于第一子特征所在的分组和第三子特征所在的分组相同,联邦学习服务端对第一子特征和第三子特征进行处理,得到中间特征;联邦学习服务端根据中间特征、第二子特征、第四子特征、第二子特征所在的分组以及第四子特征所在的分组,得到全局先验特征。
上述可能的实施方式中,第一联邦学习客户端上传的第一局部特征包括第一子特征和第二子特征,第二联邦学习客户端上传的第二局部特征包括第三子特征和第四子特征,来自第一联邦学习客户端的分组信息可以指示第一子特征所在的分组和第二子特征所在的分组,来自第二联邦学习客户端的分组信息指示第三子特征所在的分组和第四子特征所在的分组,假设第一子特征所在的分组和第三子特征所在的分组相同,则联邦学习服务端可以先对第一子特征和第三子特征进行处理,得到中间特征,再按照分组,将中间特征与第二子特征进行处理后获得的处理结果与第四子特征进行处理,以获得全局先验特征。对于不同类型的客户端上传的特征可以分开处理,减少不同类型客户端之间的影响。
一种可能的实施方式中,联邦学习服务端根据第一局部特征和第二局部特征,得到全局先验特征包括:联邦学习服务端根据第一局部特征、第二局部特征、来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征,得到全局先验特征。
上述可能的实施方式中,联邦学习服务端确定全局先验特征除了需要第一局部特征和第二局部特征外,还可以结合来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征进行确定,提高全局先验特征的准确性。
一种可能的实施方式中,联邦学习服务端根据第一局部特征、第二局部特征、来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征,得到全局先验特征包括:联邦学习服务端计算当前推理过程的局部特征与多组历史局部特征的相似度,当前推理过程的局部特征包括第一局部特征和第二局部特征,每组历史局部特征包括一次历史推理过程中来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征;联邦学习服务端根据当前推理过程的局部特征与多组历史局部特征的相似度,对多组历史局部特征对应的历史先验特征进行加权求和,以得到全局先验特征。
上述可能的实施方式中,联邦学习服务端可以计算当前推理过程的局部特征与多组历史局部特征的相似度,即确定第一局部特征与各组历史局部特征中的来自第一联邦学习客户端的历史局部特征的相似度,以及确定第二局部特征与各组历史局部特征中的来自第二联邦学习客户端的历史局部特征的相似度,然后基于相似度对多组历史局部特征对应的历史先验特征进行加权求和,获得所需的全局先验特征,提高全局先验特征的准确性。
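上述“按相似度对多组历史先验特征加权求和”的计算可示意如下（余弦相似度与softmax归一化均为假设性选择，仅示意数据流）：

```python
import numpy as np

def cosine(a, b):
    """两个向量的余弦相似度。"""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def global_prior(current, history, history_priors):
    """current: 当前推理过程的局部特征(拼接后);
    history: M 组历史局部特征; history_priors: M 个对应的历史先验特征。
    按相似度加权求和得到全局先验特征(示意)。"""
    sims = np.array([cosine(current, h) for h in history])
    w = np.exp(sims) / np.exp(sims).sum()      # 相似度归一化为权重
    return w @ np.asarray(history_priors)

cur = np.array([1.0, 0.0])
hist = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
priors = [np.array([10.0]), np.array([0.0])]
g = global_prior(cur, hist, priors)
```

与当前局部特征越相似的历史组，其历史先验特征在加权求和中占比越大。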
一种可能的实施方式中,多组历史局部特征都具有标签,标签为人为标注的每组历史局部特征的实际结果;该方法还包括:联邦学习服务端接收来自第一联邦学习客户端的推理结果;联邦学习服务端根据第一联邦学习客户端的推理结果和第二联邦学习客户端的推理结果确定目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的情况下,联邦学习服务端根据当前推理过程的局部特征与目标组历史局部特征的相似度对目标组历史局部特征进行更新,目标组历史局部特征为多组历史局部特征中,标签为目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度小于阈值的情况下,联邦学习服务端在多组历史局部特征的基础上增加一组历史局部特征,增加的一组历史局部特征为当前推理过程的局部特征。
上述可能的实施方式中,多组历史局部特征还可以具有标签,该标签是人为标注的每组历史局部特 征的实际结果,可以用来和推理结果比较,来确定推理结果的准确度。客户端在获得推理结果后,可以上传给联邦学习服务端,联邦学习服务端可以根据多个客户端的推理结果确定当前全局先验特征对应的目标推理结果,则可以从多组历史局部特征中选择标签为该目标推理结果的目标组历史局部特征,并将当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的基于相似度更新该目标组历史局部特征,小于阈值的则作为新的一组历史局部特征,可以随时更新和补充样本库,提高样本库的有效性。
一种可能的实施方式中,该方法还包括:联邦学习服务端向第一联邦学习客户端发送任务同步信息,任务同步信息用于指示第一时隙,以使得第一联邦学习客户端根据任务同步信息从本地数据中选择出第一网络数据。
本申请第三方面提供了一种推理系统,该系统包括:第一联邦学习客户端向联邦学习服务端发送第一局部特征,第一局部特征是从第一网络数据中提取出的,第一网络数据为第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,目标网络资源为第一联邦学习客户端管理的网络资源;联邦学习服务端根据第一局部特征和第二局部特征,得到全局先验特征,第二局部特征是由第二联邦学习客户端提供;联邦学习服务端向第一联邦学习客户端发送全局先验特征;第一联邦学习客户端接收来自联邦学习服务端的全局先验特征;第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果,第二网络数据为第一联邦学习客户端在第二时隙获取的与目标网络资源有关的数据,推理结果用于管理目标网络资源,其中,第二时隙与第一时隙相同或在第一时隙之后。
一种可能的实施方式中,第一网络数据为目标网络资源有关的数据在第一时隙的采样值或者从第三时隙到第一时隙为止的统计值,第二网络数据为目标网络资源有关的数据在第二时隙的采样值,第三时隙在第一时隙之前。
一种可能的实施方式中,全局先验特征为特征向量或第一机器学习模型,第一机器学习模型用于第二网络数据的推理。
一种可能的实施方式中,在全局先验特征为特征向量的情况下,第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果包括:第一联邦学习客户端根据全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果,第二机器学习模型用于第二网络数据的推理。
一种可能的实施方式中,第一联邦学习客户端根据全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:第一联邦学习客户端将第二网络数据输入到第三学习模型中,以得到第三学习模型输出的第二网络数据的多个特征;第一联邦学习客户端将全局先验特征输入到第二机器学习模型中,以获得第二网络数据的多个特征各自的权重;第一联邦学习客户端根据第二网络数据的多个特征以及第二网络数据的多个特征各自的权重,确定推理结果。
一种可能的实施方式中,第二机器学习模型包括多个第一任务模型;第一联邦学习客户端根据全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:第一联邦学习客户端根据全局先验特征,计算多个第一任务模型各自的权重;第一联邦学习客户端将第二网络数据的特征输入到多个第一任务模型中,以得到多个第一任务模型输出的推理特征;第一联邦学习客户端根据多个第一任务模型各自的权重以及多个第一任务模型输出的推理特征,得到推理结果。
一种可能的实施方式中,第一联邦学习客户端根据全局先验特征,计算多个第一任务模型各自的权重包括:第一联邦学习客户端根据全局先验特征和第二网络数据,计算多个第一任务模型各自的权重。
一种可能的实施方式中,系统还包括:第一联邦学习客户端通过第三机器学习模型提取第二网络数据的特征。
一种可能的实施方式中,第三机器学习模型包括多个第二任务模型;第一联邦学习客户端通过第三机器学习模型提取第二网络数据的特征包括:第一联邦学习客户端根据第二网络数据确定多个第二任务模型各自的权重;第一联邦学习客户端将第二网络数据输入到多个第二任务模型中,以得到多个第二任务模型输出的第二网络数据的子特征;根据多个第二任务模型各自的权重和多个第二任务模型输出的第二网络数据的子特征,得到第二网络数据的特征。
一种可能的实施方式中,每个第二任务模型是一层自编码器,多个第二任务模型中第r个任务模型的重构目标是第r-1个任务模型的残差,其中,r为大于1的整数且表示第二任务模型的数量。
一种可能的实施方式中,在全局先验特征为第一机器学习模型的情况下,第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果包括:第一联邦学习客户端提取第二网络数据的特征;第一联邦学习客户端将第二网络数据的特征输入到第一机器学习模型中,以得到第一机器学习模型输出的推理结果。
一种可能的实施方式中,在全局先验特征为第一机器学习模型的情况下,第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果包括:第一联邦学习客户端利用样本数据对第一机器学习模型进行训练;第一联邦学习客户端提取第二网络数据的特征;第一联邦学习客户端将第二网络数据的特征输入到训练后的第一机器学习模型中,以得到训练后的第一机器学习模型输出的推理结果。
一种可能的实施方式中,系统还包括:第一联邦学习客户端向联邦学习服务端发送分组信息,第一联邦学习客户端的分组信息指示第一局部特征所在的分组;联邦学习服务端接收来自第一联邦学习客户端的分组信息;联邦学习服务端根据第一局部特征和第二局部特征,得到全局先验特征包括:联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征,第二局部特征所在的分组是由来自第二联邦学习客户端的分组信息指示的。
一种可能的实施方式中,第一局部特征包括第一子特征和第二子特征,第二局部特征包括第三子特征和第四子特征,来自第一联邦学习客户端的分组信息指示第一子特征所在的分组和第二子特征所在的分组,来自第二联邦学习客户端的分组信息指示第三子特征所在的分组和第四子特征所在的分组,且第一子特征所在的分组和第三子特征所在的分组相同;联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征包括:基于第一子特征所在的分组和第三子特征所在的分组相同,联邦学习服务端对第一子特征和第三子特征进行处理,得到中间特征;联邦学习服务端根据中间特征、第二子特征、第四子特征、第二子特征所在的分组以及第四子特征所在的分组,得到全局先验特征。
一种可能的实施方式中,联邦学习服务端根据第一局部特征和第二局部特征,得到全局先验特征包括:联邦学习服务端根据第一局部特征、第二局部特征、来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征,得到全局先验特征。
一种可能的实施方式中,联邦学习服务端根据第一局部特征、第二局部特征、来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征,得到全局先验特征包括:联邦学习服务端计算当前推理过程的局部特征与多组历史局部特征的相似度,当前推理过程的局部特征包括第一局部特征和第二局部特征,每组历史局部特征包括一次历史推理过程中来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征;联邦学习服务端根据当前推理过程的局部特征与多组历史局部特征的相似度,对多组历史局部特征对应的历史先验特征进行加权求和,以得到全局先验特征。
一种可能的实施方式中,多组历史局部特征都具有标签,标签为人为标注的每组历史局部特征的实际结果;系统还包括:第一联邦学习客户端向联邦学习服务端发送第一联邦学习客户端的推理结果;联邦学习服务端根据第一联邦学习客户端的推理结果和第二联邦学习客户端的推理结果确定目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的情况下,联邦学习服务端根据当前推理过程的局部特征与目标组历史局部特征的相似度对目标组历史局部特征进行更新,目标组历史局部特征为多组历史局部特征中,标签为目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度小于阈值的情况下,联邦学习服务端在多组历史局部特征的基础上增加一组历史局部特征,增加的一组历史局部特征为当前推理过程的局部特征。
一种可能的实施方式中,系统还包括:联邦学习服务端向第一联邦学习客户端发送任务同步信息,任务同步信息用于指示第一时隙;第一联邦学习客户端根据任务同步信息从本地数据中选择出第一网络数据。
本申请第四方面提供了一种推理装置,可以实现上述第一方面或第一方面中任一种可能的实施方式中的方法。该装置包括用于执行上述方法的相应的单元或模块。该装置包括的单元或模块可以通过软件和/或硬件方式实现。该装置例如可以为网络设备,也可以为支持网络设备实现上述方法的芯片、芯片系统、或处理器等,还可以为能实现全部或部分网络设备功能的逻辑模块或软件。
本申请第五方面提供了一种推理装置,可以实现上述第二方面或第二方面中任一种可能的实施方式中的方法。该装置包括用于执行上述方法的相应的单元或模块。该装置包括的单元或模块可以通过软件和/或硬件方式实现。该装置例如可以为网络设备,也可以为支持网络设备实现上述方法的芯片、芯片系统、或处理器等,还可以为能实现全部或部分网络设备功能的逻辑模块或软件。
本申请第六方面提供了一种计算机设备,包括:处理器,该处理器与存储器耦合,该存储器用于存储指令,当指令被处理器执行时,使得该计算机设备实现上述第一方面或第一方面中任一种可能的实施方式中的方法。该计算机设备例如可以为网络设备,也可以为支持网络设备实现上述方法的芯片或芯片系统等。
本申请第七方面提供了一种计算机设备,包括:处理器,该处理器与存储器耦合,该存储器用于存储指令,当指令被处理器执行时,使得该计算机设备实现上述第二方面或第二方面中任一种可能的实施方式中的方法。该计算机设备例如可以为网络设备,也可以为支持网络设备实现上述方法的芯片或芯片系统等。
本申请第八方面提供了一种计算机可读存储介质,该计算机可读存储介质中保存有指令,当该指令被处理器执行时,实现前述第一方面或第一方面任一种可能的实施方式、第二方面或第二方面中任一种可能的实施方式提供的方法。
本申请第九方面提供了一种芯片系统,芯片系统包括至少一个处理器,处理器用于执行存储器中存储的计算机程序或指令,当计算机程序或指令在至少一个处理器上执行时,实现前述第一方面或第一方面任一种可能的实施方式、第二方面或第二方面中任一种可能的实施方式提供的方法。
本申请第十方面提供了一种计算机程序产品,计算机程序产品中包括计算机程序代码,当该计算机程序代码在计算机上执行时,实现前述第一方面或第一方面任一种可能的实施方式、第二方面或第二方面中任一种可能的实施方式提供的方法。
本申请第十一方面提供了一种通信系统，该通信系统包括第一联邦学习客户端和联邦学习服务端，该第一联邦学习客户端用于实现前述第一方面或第一方面任一种可能的实施方式提供的方法，该联邦学习服务端用于实现前述第二方面或第二方面中任一种可能的实施方式提供的方法。
附图说明
图1为本申请实施例提供的一种计算机系统的结构示意图;
图2为本申请实施例提供的一种推理方法的流程示意图;
图3为本申请实施例提供的一种推理结构示意图;
图4为本申请实施例提供的另一种推理结构示意图;
图5为本申请实施例提供的另一种推理结构示意图;
图6为本申请实施例提供的一种计算图的示意图;
图7为本申请实施例提供的一种知识图谱的示意图;
图8为本申请实施例提供的另一种推理结构示意图;
图9为本申请实施例提供的联邦学习服务端的本地数据的获取方式示意图;
图10为本申请实施例提供的一种固定参数任务选择器生成的流程示意图;
图11为本申请实施例提供的一种任务模型选择的流程示意图;
图12为本申请实施例提供的一种模型预训练的流程示意图;
图13为本申请实施例提供的一种模型训练的流程示意图;
图14为本申请实施例提供的一种模型按策略训练的流程示意图;
图15为本申请实施例提供的一种推理装置的结构示意图;
图16为本申请实施例提供的另一种推理装置的结构示意图;
图17为本申请实施例提供的一种计算机设备的结构示意图;
图18为本申请实施例提供的另一种计算机设备的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”以及相应术语标号等是用于区别类似的对象，而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换，这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外，术语“包括”和“具有”以及他们的任何变形，意图在于覆盖不排他的包含，以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元，而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
在本申请的描述中,除非另有说明,“/”表示或的意思,例如,A/B可以表示A或B;本申请中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,在本申请的描述中,“至少一项”是指一项或者多项,“多项”是指两项或两项以上。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
图1是适用于本申请实施例的计算机系统的结构示意图,如图1所示,该计算机系统包括联邦学习服务端和多个客户端。联邦学习服务端和多个客户端配合,用于模型的训练和基于模型的推理。其中,客户端的数量可以根据实际需求进行调整。
应理解,在推理阶段,某个客户端仅使用本地数据进行推理,而不使用其他客户端的数据进行推理,尽管可以保护其他客户端的隐私,但可能导致推理结果不够准确。
为此,本申请实施例提供了一种推理方法,在该方法中,服务端参与推理过程,具体地,联邦学习服务端根据多个客户端的局部特征得到全局先验特征,并将全局先验特征发送至各个客户端,使得各个客户端利用全局先验特征和本地数据进行推理。由于全局先验特征是根据多个客户端的局部特征得到的,局部特征是从第一网络数据中提取出的,所以对于每个客户端来说,在推理过程中都利用了其他客户端的数据的信息,可以提高推理结果的准确性。
本申请实施例提供的推理方法的应用场景可以有多种,例如可以应用于语音识别场景和图像识别场景。
以语音识别场景为例,在演唱会现场一个人在演唱会现场使用语言助手进行语音转文字操作,单个设备无法判断接收到的语音是来自背景(演唱会的歌词)还是来自正在录入语音的人,此时则需要同一个演唱会的其它用户的设备采集到的语音信息,以通过其它用户的设备采集到的语音信息判断设备接收到的语音是来自背景(演唱会的歌词)还是来自正在录入语音。在该场景下,则可以使用本申请实施例提供的推理方法,根据多个用户的设备采集到的语音信息进行推理,以推理出语音的来源(背景或用户录入的)。
以图像识别场景为例,在一条道路上行驶了很多辆车,每辆车上的摄像头都拍摄到了该条道路的图像。使用本申请实施例提供的推理方法,则可以通过多辆车的摄像头拍摄到的图像进行推理,以推理出当前道路的交通状况(不拥堵、正常拥堵或交通事故引发拥堵)。
本申请实施例提供的推理方法所使用的数据(包括下文中的本地数据、第一网络数据、第二网络数据等)可以是图像、语音、文字等多种形式的数据。
需要说明的是,在推理过程中需要用到一部分模型,这部分模型可以通过预先的训练得到,为此, 下文会先介绍本申请实施例提供的推理方法,然后介绍推理方法所使用的模型的训练方法。
下面先介绍本申请实施例提供的推理方法。
图2是本申请实施例提供的一种推理方法的流程示意图,应用于推理系统,该推理系统包括服务端和多个客户端,具体可以是图1所示的推理系统。
在该实施例中,以多个客户端中的一个为例,该实施例具体包括:
步骤201.服务端向多个客户端发送任务同步信息,以使得多个客户端根据任务同步信息从本地数据中选择出第一网络数据,任务同步信息表示对第一网络数据的时间要求。
任务同步信息的作用是使得多个客户端从本地数据中选择与目标网络资源有关的第一网络数据，以在满足用户需求的前提下进行推理。其中，目标网络资源可以是目标网络当前的状态，例如目标网络资源可以是车辆的故障状态，则第一网络数据可以包括以下至少一项：第一车辆的定位数据、行驶速度数据、行驶里程数据、电池电量数据、电池电压数据、刹车数据、油门数据、电机的电压和电流数据、绝缘电阻数据和温度数据等，本申请实施例对此不作限定。
本申请实施例对任务同步信息的内容不做具体限定，例如，任务同步信息可以是某一具体时刻，以使得多个客户端根据任务同步信息从本地数据中选择出的第一网络数据的时刻相同，或者说，使得多个客户端根据任务同步信息从本地数据中选择出的第一网络数据的时刻比较接近，或者说，以使得多个客户端根据任务同步信息选择出该时刻之前一段时间内沿数据条数维度降维后的数据作为第一网络数据，如多条数据的平均值、中值或者通过主成分分析等降维得到的主成分值、聚类的聚类中心值等。
第一联邦学习客户端为多个客户端中的一个,第一联邦学习客户端可以接收来自服务端的任务同步信息,任务同步信息表示对该第一联邦学习客户端的本地数据中的第一网络数据的时间要求。
步骤202.第一联邦学习客户端根据任务同步信息从本地数据中选择出第一网络数据。
本实施例中,该任务同步信息指示一个时间点,第一联邦学习客户端可以从本地数据中选择该时间点的数据作为第一网络数据。
第一网络数据为目标网络资源有关的数据在第一时隙的采样值或者从第三时隙到第一时隙为止的统计值,第三时隙在第一时隙之前。
例如,任务同步信息包括10点10分这一时刻,那么第一联邦学习客户端可以根据任务同步信息选择10点10分这一时刻的本地数据作为第一网络数据,或者,第一联邦学习客户端可以根据任务同步信息选择与10点10分这一时刻较为接近的时刻的本地数据或本地数据沿数据维度降维后的数据作为第一网络数据,例如,第一联邦学习客户端可以选择10点9分至10点11分内的本地数据作为第一网络数据。
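上述“从本地数据中选择与任务同步信息指示时刻最接近的数据”可用如下示意代码说明（数据结构与时间表示均为假设）：

```python
def select_first_network_data(local_data, sync_time):
    """local_data: [(时间戳, 数据), ...] 的本地数据列表;
    sync_time: 任务同步信息指示的时刻。
    返回采集时间与 sync_time 最接近的一条数据(示意)。"""
    return min(local_data, key=lambda item: abs(item[0] - sync_time))[1]

# 示例:时间戳以分钟为单位,任务同步信息指示612分(即10点12分)
data = [(600, "a"), (610, "b"), (625, "c")]
picked = select_first_network_data(data, 612)
```

实际实现中也可以按时间窗口取多条数据再降维，这里仅示意“取最接近时刻”这一种策略。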
步骤203.第一联邦学习客户端从第一网络数据中提取第一网络数据的特征,第一网络数据的特征包含第一局部特征。
第一网络数据的特征可以包括一个特征,也可以包括多个特征;当第一网络数据的特征包括多个特征时,局部特征可以是多个特征中的一个或多个。
第一网络数据的特征一般用特征向量表示，以第一网络数据的特征包括三个特征为例，第一网络数据的特征可以表示为其中，局部特征可以为这三个特征中的一个或多个，传输哪个或哪些特征可以事先约定，也可以根据当前第一网络数据特性、第一网络数据的特征特性、客户端计算资源、客户端和/或服务器端网络资源决定，决定的策略可以是人工写的逻辑，也可以是通过机器学习得到的模型，本申请实施例以为例，其中，n是第一联邦学习客户端的标识，可以为数字1、2、3等，不同的n指代不同的客户端。下文也将第一网络数据的特征称为隐含特征。第一网络数据特性指的是原始数据特性，比如原始数据信息量少则只传一个特征，如果原始数据信息量大则传多个特征。原始数据的信息量可以用熵来计算，也可以通过一个神经网络或者机器学习模型来计算。
第一联邦学习客户端可以通过局部特征提取模型提取第一网络数据的特征,其中,局部特征提取模型的种类包括多种,本申请实施例对此不做具体限定。
作为一种可实现的方式，当第一网络数据特征为多个类型时，局部特征提取模型可以包括多个第二任务模型，相应地，步骤203具体可以是：第一联邦学习客户端将第一网络数据分别输入到多个第二任务模型中，以得到多个第二任务模型输出的第一网络数据的多个特征。即第一联邦学习客户端将第一网络数据按照类型分别输入到不同的第二任务模型中，各个第二任务模型分别输出该类型对应的特征。
本申请实施例对第二任务模型的种类不做具体限定。
作为一种可实现的方式,每个第二任务模型是一层自编码器,第r个任务模型的重构目标是第r-1个任务模型的残差,其中,r为大于1的整数且表示第二任务模型的数量。
具体地,自编码器是一个输入和学习目标相同的神经网络,其结构分为编码器和解码器两部分。输入第一网络数据后,由编码器输出的隐含特征,即“编码特征(encoded feature)”可视为第一网络数据的表征。
其中,自编码器可以采用深度神经网络(deep neural network,DNN)、卷积神经网络(convolutional neural network,CNN)或者Transformer神经网络实现。
具体地，以3个第二任务模型为例，即局部特征提取模型包含3层自编码器。第1层的自编码器输入为原始特征Xn，通过编码器E1得到隐含特征h1，编码器可以采用深度神经网络实现。隐含特征h1的维度比原始特征Xn的维度低。隐含特征h1通过与编码器E1对应的解码器D1得到原始特征Xn的重建特征，解码器可以采用深度神经网络。
第2层的自编码器输入为原始特征Xn，通过编码器E2得到隐含特征h2，编码器可以采用深度神经网络实现。隐含特征h2的维度比原始特征Xn的维度低。隐含特征h2通过与编码器E2对应的解码器D2得到原始特征Xn与第1层模型重构的差异的重建特征。
第3层的自编码器输入为原始特征Xn，通过编码器E3得到隐含特征h3，编码器可以采用深度神经网络实现。隐含特征h3的维度比原始特征Xn的维度低。隐含特征h3通过与编码器E3对应的解码器D3得到原始特征Xn与第1层模型重构和第2层模型重构之和的差异的重建特征。
其中,该自编码器的预训练过程可以是:
(1)联邦学习服务端下发第n层的自编码器模型；(2)客户端接收第n层的自编码器模型，使用本地数据训练模型若干轮后上传第n层的自编码器模型到联邦学习服务端；(3)重复(1)到(2)直到第n层的自编码器收敛；(4)重复(1)到(3)逐层训练各层的自编码器。
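上述(1)至(4)的逐层联邦预训练控制流可用如下伪框架示意（本地训练函数、按样本量加权的聚合方式以及以固定轮数代替收敛判断均为假设）：

```python
def fedavg(models, weights):
    """按样本量加权平均各客户端模型参数(示意,模型为参数列表)。"""
    total = sum(weights)
    return [sum(m[i] * w for m, w in zip(models, weights)) / total
            for i in range(len(models[0]))]

def pretrain_layer(server_model, clients, rounds):
    """单层自编码器的联邦预训练:下发-本地训练-上传-聚合,
    重复直到收敛(这里以固定轮数代替收敛判断)。"""
    for _ in range(rounds):
        updates, sizes = [], []
        for local_train, n_samples in clients:
            updates.append(local_train(list(server_model)))  # 客户端本地训练若干轮
            sizes.append(n_samples)
        server_model = fedavg(updates, sizes)                # 服务端聚合
    return server_model

# 示例:两个客户端各自把模型参数拉向本地最优值,样本量分别为1和3
clients = [(lambda m: [0.0, 0.0], 1), (lambda m: [2.0, 2.0], 3)]
out = pretrain_layer([1.0, 1.0], clients, rounds=1)
```

逐层训练时，对每一层调用一次 pretrain_layer 即可，对应步骤(4)。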
步骤204.第一联邦学习客户端向服务端发送第一局部特征,第一局部特征是从第一网络数据中提取出的,第一联邦学习客户端为多个客户端中的任意一个。相应地,服务端接收来自多个客户端的局部特征,局部特征是从第一网络数据中提取出的。
本实施例中，第一联邦学习客户端在从第一网络数据的特征中选择出第一局部特征后，可以发送给服务端，其他客户端也可以相应从本地的第一网络数据中提取局部特征发送给服务端，示例性的，第二联邦学习客户端从本地的第一网络数据的特征中选择第二局部特征发送给服务端。
步骤205.服务端根据来自多个客户端的局部特征,得到全局先验特征。
本实施例中,联邦学习服务端采集多个客户端上报的局部特征,然后可以对这些局部特征进行处理,基于多个客户端的特征确定全局先验特征,该全局先验特征通过多客户端信息融合获得,可以提高该全局先验特征的准确度。
其中，联邦学习服务端接收多个客户端上传的局部特征，将多个客户端上传的局部特征拼接成全局先验特征提取模型的输入特征，输入全局先验特征提取模型，得到输出的各客户端对应的全局先验特征。
全局先验特征提取模型采用DNN、CNN、Transformer或者RNN模型,模型的输入除了拼接后的局部特征还有上一次的输出值。全局先验特征提取模型也可以使用查找表的方式实现,例如将拼接后的局部特征作为key,对应的全局先验特征作为value。
拼接指的是沿特征的某一个维度或者多个维度将数据级联在一起。例如局部特征是一个D×W×H的3维张量，如果沿W的维度拼接，则拼接后为D×NW×H的3维张量；如果沿W和H的维度拼接，则拼接后为D×N1W×N2H的3维张量，其中N1N2=N。
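上述沿某一维度级联的拼接可以用numpy示意（以N=5个客户端、沿W维拼接为例，维度数值为假设）：

```python
import numpy as np

D, W, H, N = 2, 3, 4, 5
tensors = [np.zeros((D, W, H)) for _ in range(N)]   # N个客户端的局部特征张量
cat_w = np.concatenate(tensors, axis=1)             # 沿W维拼接 -> D x NW x H
```

沿多个维度拼接时，可先把客户端分为N1×N2的网格，再依次沿W、H两个维度各做一次拼接。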
由于客户端的数据只在分组内有影响，组间没有影响。如果将所有客户端的数据一起确定全局先验特征会影响组间的输出，所以需要分组输入。作为一种可实现的方式，第一联邦学习客户端还可以向服务端发送分组信息，分组信息指示局部特征所在的分组，以使得服务端根据来自多个客户端的局部特征以及局部特征所在的分组得到全局先验特征。
其中,第一联邦学习客户端上传的第一局部特征包括第一子特征和第二子特征,第二联邦学习客户端上传的第二局部特征包括第三子特征和第四子特征,来自第一联邦学习客户端的分组信息可以指示第一子特征所在的分组和第二子特征所在的分组,来自第二联邦学习客户端的分组信息指示第三子特征所在的分组和第四子特征所在的分组,假设第一子特征所在的分组和第三子特征所在的分组相同,则联邦学习服务端可以先对第一子特征和第三子特征进行处理,得到中间特征,再按照分组,将中间特征与第二子特征进行处理后获得的处理结果与第四子特征进行处理,以获得全局先验特征。对于不同类型的客户端上传的特征可以分开处理,减少不同类型客户端之间的影响。
具体的,该分组信息为对客户端之间的分组的信息,组内对应的是相同土地性质(农田、道路、城镇)的客户端,组间是不同土地性质的客户端。不同的土地性质对信号的传输影响不同,因此进行分组处理减少不同土地客户端之间的影响。例如高速路上,处于同一方向或者同一条车道上的车辆(1、2、3)的数据为同一分组,不同方向的车辆(4、5、6)的数据为另一分组,则车辆(1、2、3)上传的分组信息指示同一分组,车辆(4、5、6)上传的分组信息指示同一分组。
联邦学习服务端接收多个客户端上传的局部特征和对应的分组信息。分组信息可以通过对步骤203中提取的局部特征分别进行聚类得到。
分组相同的局部特征拼接在一起。根据全局先验特征提取模型采用的不同模型，如DNN、CNN、RNN、Transformer，对应不同的拼接方式：
对于DNN、CNN这类输入数据需要同时输入的模型，对局部特征按全局位置拼接，未被该组选中的位置用默认值代替，如下图所示。例如，各客户端在全局位置如下表1所示：
表1
如果部分客户端属于同一个分组、其余客户端分别属于另外两个分组，则数据拼接后如表2、表3和表4所示：
表2
表3
表4
如果采用RNN和Transformer模型，输入数据可以逐一输入到模型。可以将属于同一组的局部特征添加全局位置编码然后输入到模型。全局位置编码如下图所示。
输入模型的序列为
将拼接好的局部特征输入全局先验特征提取模型的第一层模型,得到第一层的隐含特征表示第一层第m组的隐含特征。
将全局先验特征提取模型的第一层模型输出的隐含特征与局部特征进行拼接,如下表5所示:
表5
将拼接后的局部特征按第一层的方式分组输入全局先验特征提取模型的第二层模型，得到第二层每组的隐含特征。第二层的分组可以是在第一层的分组基础上进行再分组，也可以是对全部局部特征进行重新分组。
将全局先验特征提取模型的第二层模型输出的隐含特征与局部特征进行拼接，将拼接的结果输入全局先验特征提取模型的第三层模型，得到输出的各客户端对应的全局先验特征。
步骤206.联邦学习服务端向多个客户端分别发送全局先验特征,相应的,第一联邦学习客户端接收全局先验特征。
本实施例中,联邦学习服务端在确定全局先验特征后,即可向连接的多个客户端下发该全局先验特征。
步骤207.第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果。
本实施例中,第一联邦学习客户端可以基于该全局先验特征对本地的第二网络数据进行推理,获得第二网络数据的推理结果。其中,第二时隙与第一时隙相同或在第一时隙之后,即第二网络数据可以与第一网络数据处于相同时刻,也可以处于第一网络数据的时刻之后的数据,示例性的,假设第一网络数据为9:00的数据,则第二网络数据可以是9:00到9:15之间任意时刻的数据,本申请实施例对此不作限定。
推理的过程中可以直接对第二网络数据处理,也可以对从第二网络数据中提取的隐含特征进行推理,还可以是从隐含特征中选择一个或多个特征进行推理,本申请实施例对此不作限定。
全局先验特征为特征向量或推理模型,推理模型用于根据第二网络数据输出第二网络数据的推理结果。
对于全局先验特征为特征向量的场景,第一联邦学习客户端根据全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果,第二机器学习模型用于第二网络数据的推理。具体的,第一联邦学习客户端需要将全局先验特征和第二网络数据输入到本地的第二机器学习模型中进行推理,将该第二机器学习模型输出的结果作为推理结果。
其中,在将第二网络数据输入到第二机器学习模型中之前,第一联邦学习客户端还可以将第二网络数据输入到第三机器学习模型(例如局部特征提取模型),以获得可以体现第二网络数据特点的多个特征,可以节省计算资源。第一联邦学习客户端将全局先验特征和第二网络数据输入到本地的第二机器学习模型中进行推理的方式可以是,第一联邦学习客户端将全局先验特征输入到该第二机器学习模型中,可以获得第二机器学习模型输出的第二网络数据的多个特征对应的权重,即每个特征对应一个权重,然后即可根据第二网络数据的多个特征以及第二网络数据的多个特征各自的权重确定推理结果。
图3为本申请实施例提供的一种推理结构示意图,如图3所示,客户端根据全局先验特征计算不同隐含特征对应的权重值,具体的,局部特征提取模型采用多层自编码器。
客户端Cn接收联邦学习服务端下发的全局先验特征。推理模型（第二机器学习模型）采用DNN模型，客户端Cn将全局先验特征输入推理模型，推理模型输出多个隐含特征对应的权重值并保存权重值。
客户端Cn通过采集器获取本地t时刻的待推理数据X(t)n，待推理数据X(t)n通过局部特征提取模型输出多个隐含特征。根据局部特征提取模型输出的多个隐含特征、推理模型输出的权重值和本地的锚点数据集，计算分类到各锚点的距离r(t)n,m，其中
选取全部锚点数据集对应的r(t)n,m最小的值对应的yn,m作为X(t)n的输出。
计算局部特征提取模型输出的多个隐含特征分别对应的分类结果，作为输出稳定性的评价指标。
选取全部锚点数据集对应的最小的值对应的yn,m作为X(t)n第1个隐含特征的类别，选取全部锚点数据集对应的最小的值对应的yn,m作为X(t)n第2个隐含特征的类别，选取全部锚点数据集对应的最小的值对应的yn,m作为X(t)n第3个隐含特征的类别。
输出稳定性的评价指标=相同类别最大数/隐含特征向量总数。
如果三个类别一致则稳定性的评价指标=1,如果两个类别一致则稳定性的评价指标=2/3,如果类别都不一致则稳定性的评价指标=1/3。
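上述“输出稳定性的评价指标=相同类别最大数/隐含特征向量总数”的计算可示意如下：

```python
from collections import Counter

def stability(labels):
    """labels: 各隐含特征分别得到的分类结果列表。
    返回相同类别最大数 / 隐含特征向量总数(示意)。"""
    return Counter(labels).most_common(1)[0][1] / len(labels)
```

例如三个隐含特征分类全部一致时指标为1，两个一致时为2/3，互不一致时为1/3，与上文示例一致。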
上面讲述的第二机器学习模型是一个任务模型,该任务模型不需要对第二网络数据进行处理,下面对任务模型需要对第二网络数据进行处理的方式进行描述,其中,第二机器学习模型包括多个第一任务模型,可以对第二网络数据的多个特征分别进行处理,具体的,第一联邦学习客户端获得全局先验特征后,可以从该全局先验特征确定多个第一任务模型各自的权重,第一联邦学习客户端可以将第二网络数据的特征输入到多个第一任务模型中,该多个第一任务模型的输入特征可以是默认类型的特征,即第二网络数据的特征按照类别分类,并输入到对应类别的第一任务模型,第一联邦学习客户端即可以对多个第一任务模型输出的推理特征结合多个第一任务模型对应的权重进行加权平均,以获得推理结果。
其中,第一联邦学习客户端根据全局先验特征确定多个第一任务模型各自的权重的方式可以是,第一联邦学习客户端需要结合全局先验特征和第二网络数据,来计算多个第一任务模型各自的权重,即结合需要推理的数据的特征计算权重,提高权重的准确度。
其中,在将第二网络数据输入到第二机器学习模型中之前,第一联邦学习客户端还可以将第二网络数据输入到第三学习模型(例如局部特征提取模型),以获得可以体现第二网络数据特点的多个特征,可以节省计算资源。
图4为本申请实施例提供的另一种推理结构示意图，请参阅图4，客户端Cn接收联邦学习服务端下发的全局先验特征并保存全局先验特征。其中，推理模型采用混合专家模型，该混合专家模型包括模型选择器和多个第一任务模型，如图中的任务模型1至任务模型N。
客户端Cn通过采集器获取本地t时刻的待推理数据X(t)n,推理模型(第二机器学习模型)采用混合专家模型,将待推理数据X(t)n和全局先验特征输入混合专家模型计算推理结果。
其中，混合专家模型由模型选择器、多个任务模型组成的任务模型组和分类器组成：模型选择器输入为全局先验特征（也可以加入第二网络数据），输出为对“任务模型组”中的多个任务模型的权重（选择可视为一种0-1化的权重）；“任务模型组”由N个“任务模型”组成，每个“任务模型”根据输入数据，输出隐含特征向量；分类器输入的是经过模型选择器输出的权重加权求和后的隐含特征向量，输出分类结果。
具体步骤如下:
第一步:全局先验特征(也可以加入第二网络数据)通过“模型选择器”计算多个任务模型的权重值。
全局先验特征首先输入模型选择器模型(模型可以用DNN、CNN、Transformer实现),模型参数用表示。模型选择器模型的输入-输出关系表示为:或者
然后计算g(t)′到各个锚点的距离。距离可以使用欧式距离、余弦距离或者自定义的距离。模型选择器模型的锚点集合为：
其中表示输入第m个任务模型对应的第k个锚点。表示第m个任务模型对应的锚点集合。
然后选取每个任务模型对应的锚点的集合中的各锚点到g(t)′的最佳距离。最佳距离与具体计算距离的方式有关，如果使用欧式距离，最佳距离为欧式距离最小值的倒数；如果使用余弦距离，最佳距离为余弦距离最大值。
最后计算“模型选择器”输出的权重向量g=[g1 g2 … gm … gM]，gm表示对第m个任务网络的权重
第二步:计算本地原始特征经过“模型选择器”选中的任务模型的隐含特征提取值。第m个任务网络的参数用表示,第m个任务网络的输入-输出关系表示为:
第三步:“加权求和器”将“模型选择器”输出的M个权重对“任务模型组”的M个隐含特征向量进行加权求和(未选中的任务模型的隐含特征可不参与计算),
第四步:分类器采用DNN实现,模型参数为输入-输出关系为:
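上述四步混合专家推理（模型选择器→任务模型→加权求和→分类器）可用如下草图示意。各子模型这里均以线性映射代替，仅示意数据流，并非DNN实现：

```python
import numpy as np

def moe_forward(global_prior, x, selector_w, experts, classifier_w):
    """混合专家前向(示意):
    selector_w 把全局先验特征映射为各任务模型的权重;
    experts 为各任务模型(这里简化为线性映射矩阵);
    classifier_w 为分类器(这里简化为线性映射)。"""
    logits = selector_w @ global_prior
    g = np.exp(logits) / np.exp(logits).sum()       # 第一步:各任务模型的权重
    hs = [E @ x for E in experts]                   # 第二步:各任务模型的隐含特征
    h = sum(w * h_m for w, h_m in zip(g, hs))       # 第三步:按权重加权求和
    return classifier_w @ h                         # 第四步:分类器输出

gp = np.array([1.0, -1.0])                          # 全局先验特征(示意)
x = np.array([1.0, 2.0])                            # 第二网络数据的特征(示意)
sel = np.array([[5.0, 0.0], [0.0, 5.0]])            # 该先验下几乎只选中专家1
experts = [np.eye(2), -np.eye(2)]
out = moe_forward(gp, x, sel, experts, np.eye(2))
```

示例中模型选择器的输出近似0-1化，体现“选择可视为一种0-1化的权重”。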
本实施例中,联邦学习服务端下发的全局先验特征还可以直接是混合专家模型中的模型选择器,此处不作限定,局部特征提取模型也可以是混合专家模型。局部特征提取模型(第三机器学习模型)可以包括多个第二任务模型,则第一联邦学习客户端通过第三机器学习模型提取第二网络数据的特征的方式可以是,第一联邦学习客户端先根据本地数据或者第二网络数据确定多个第二任务模型对应的权重,然后将第二网络数据输入到第二任务模型中,可以获得第二任务模型输出的第二网络数据的子特征,第一联邦学习客户端即可根据多个第二任务模型对应的权重以及第二任务模型输出的子特征进行加权平均,以获得第二网络数据的特征。
图5为本申请实施例提供的另一种推理结构示意图,请参阅图5,联邦学习服务端生成任务同步信息,并下发任务同步信息给多个客户端。
客户端Cn接收到联邦学习服务端下发的任务同步信息,并从本地数据中找到满足任务同步信息的数据Xn。其中,满足任务同步信息指的是数据Xn的采集时间与任务同步信息的时间最为接近。
客户端Cn通过本地的局部特征提取模型得到隐含特征，其中的一个或多个隐含特征作为客户端Cn的局部特征。局部特征提取模型由混合专家模型实现，即由多个任务模型i和一个模型选择器组成。数据通过模型选择器选择任务模型进行特征提取，得到局部特征
任务模型i采用解耦表征学习训练得到，即任务模型输出的局部特征可以分成K个组，组内特征具有相似性，组间特征具有特异性。不同的神经网络结构得到的特征的类型不同，因此特征可以是一个标量，也可以是一个向量、一个矩阵或者一个张量，具体如下：
局部特征提取器采用全连接神经网络或者RNN，如是一个标量，其中W1,W2为模型参数，是一个矩阵；如果就是一个向量， 其中和W1是模型参数，是矩阵。引入张量计算的表达W2是一个三维的张量，是张量计算；W2还可以是更高维的张量，此时得到的可以是一个矩阵或者张量。
局部特征提取器采用卷积神经网络或者Transformer神经网络,则是一个矩阵,以图像为例,图像经过CNN的卷积层之后得到此时表示是卷积核m对应的第m个通道的特征图,是一个矩阵。如果将一个通道内的数据进行求平均操作,则就是一个标量。
组内特征具有相似性，组间特征具有特异性，相似性和特异性可以通过度量之间的距离得到，如果m1和m2属于同一个特征组，则它们的距离值小，如果它们属于不同的特征组，则它们的距离值大。
客户端Cn将局部特征上传到联邦学习服务端，相应的，联邦学习服务端接收多个客户端上传的局部特征。联邦学习服务端可以将多个客户端上传的局部特征拼接成全局先验特征提取模型的输入特征，输入全局先验特征提取模型，输出全局先验特征，全局先验特征为各客户端推理模型的任务选择器。并将各客户端对应的全局先验特征（推理模型的任务选择器）发送给对应的客户端Cn
客户端Cn接收联邦学习服务端下发的全局先验特征（推理模型的任务选择器），通过采集器获取本地t时刻的待推理数据X(t)n，将待推理数据X(t)n输入局部特征提取模型得到局部特征h(t)n，将局部特征h(t)n输入推理模型得到推理结果。
推理模型采用混合专家模型，即由模型选择器和任务模型组成。模型选择器为全局先验特征（推理模型的任务选择器）。每个任务模型对应局部特征h(t)n的一个特征组，每个特征组中的特征具有相似性。即任务模型1的输入是h(t)n的第1组特征，任务模型2的输入是h(t)n的第2组特征，……，任务模型n的输入是h(t)n的第n组特征。
全局先验特征也可以直接是推理模型，例如一个带有模型参数的分类器或者回归器，其中z表示分类器或者回归器的输入，表示分类器或者回归器模型参数。各客户端的分类器或者回归器可以相同，也可以不同。该可以是
图6为本申请实施例提供的一种计算图的示意图,如图6所示,全局先验特征可以是一个带有一组特征值以及计算逻辑关系的计算图,以为例,其中表示第n个客户端的第k个特征向量,特征向量可以作为模型参数的一部分在训练的过程中通过数据学习,也可以是人工设置的值;Cn表示第n个客户端的计算流图。各客户端的计算图可以相同,也可以不同。计算图可以通过数据训练得到,也可以通过知识图谱得到。
左图表示输入的特征向量v与计算图中的特征值计算余弦距离，即v与其中的每一个值计算余弦距离得到
然后计算后面的逻辑操作
右图在计算余弦距离后与阈值比较，大于阈值的为1，小于阈值的为0，即
然后计算逻辑操作,这里的加对应的是逻辑里面的或操作,乘对应的是逻辑里面的与操作:
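上述“余弦距离与阈值比较得到0/1，再做逻辑或（加）与逻辑与（乘）”可示意如下（阈值与逻辑组合方式均为演示假设）：

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def logic_graph(v, anchors, thr=0.9):
    """v与计算图中各特征值计算余弦距离,与阈值比较得到0/1,
    再做逻辑运算:加对应或,乘对应与(示意,前两位做或后与第三位做与)。"""
    bits = [1 if cosine(v, a) > thr else 0 for a in anchors]
    or_01 = min(bits[0] + bits[1], 1)   # 加:逻辑或
    return or_01 * bits[2]              # 乘:逻辑与

v = np.array([1.0, 0.0])
anchors = [np.array([1.0, 0.05]), np.array([0.0, 1.0]), np.array([1.0, 0.0])]
res = logic_graph(v, anchors)
```

逻辑结构（哪些位做或、哪些位做与）即计算图中的计算逻辑关系，实际可由数据训练或知识图谱给出。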
全局先验特征可以是知识图谱，以图像识别为例。局部特征提取器采用解耦表征学习训练得到，即局部特征提取器得到的隐含特征可以分组成不同子概念的学习，如车轮、车门、车窗、车灯、人的眼睛、人的鼻子、人的嘴等，即局部特征提取器得到的隐含特征h={h(1)=车轮，h(2)=车门，h(3)=车窗，h(4)=车灯，h(5)=人的眼睛，h(6)=人的鼻子，h(7)=人的嘴}。
图7为本申请实施例提供的一种知识图谱的示意图,如图7所示,此时局部特征提取器在参与推理时不但要计算特征的值还要计算特征的位置。激活表示局部特征提取器某个通道的值之和或通道的组合或者通道内某个像素值大于阈值。
当客户端直接使用已经训练好的局部特征提取模型和推理模型时:
客户端Cn接收联邦学习服务端下发的全局先验特征,计算图或者分类器/回归器或者知识图谱,本地保存计算图或者分类器/回归器或者知识图谱。
客户端Cn通过采集器获取本地t时刻的待推理数据X(t)n，将待推理数据X(t)n通过局部特征提取模型输出多个隐含特征，然后将隐含特征输入计算图或者分类器/回归器或者知识图谱得到输出结果。
当客户端还需要对推理模型进行重新训练时:第一联邦学习客户端利用样本数据对第一机器学习模型进行训练,提取第二网络数据的特征,然后将第二网络数据的特征输入到训练后的第一机器学习模型中,以得到训练后的第一机器学习模型输出的推理结果。
具体的，客户端Cn接收联邦学习服务端下发的全局先验特征（计算图或者分类器/回归器或者知识图谱），初始化本地的局部特征提取模型参数，使用计算图或者分类器/回归器或者知识图谱作为分类器或者回归器，组成推理模型，再使用本地数据训练推理模型并保存训练结束的推理模型。其中，此处的本地数据为历史数据和对应的历史推理结果。
客户端Cn通过采集器获取本地t时刻的待推理数据X(t)n,待推理数据X(t)n通过推理模型得到输出结果。
联邦学习服务端确定全局先验特征除了需要第一局部特征和第二局部特征外,还可以结合来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征进行确定。联邦学习服务端可以计算当前推理过程的局部特征与多组历史局部特征的相似度,即确定第一局部特征与各组历史局部特征中的来自第一联邦学习客户端的历史局部特征的相似度,以及确定第二局部特征与各组历史局部特征中的来自第二联邦学习客户端的历史局部特征的相似度,然后基于相似度对多组历史局部特征对应的历史先验特征进行加权求和,获得所需的全局先验特征。
在一个可行的实施方式中,多组历史局部特征还可以具有标签,该标签是人为标注的每组历史局部特征的实际结果,可以用来和推理结果比较,来确定推理结果的准确度。客户端在获得推理结果后,可以上传给联邦学习服务端,联邦学习服务端可以根据多个客户端的推理结果确定当前全局先验特征对应的目标推理结果,则可以从多组历史局部特征中选择标签为该目标推理结果的目标组历史局部特征,如果当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值,则可以根据当前推理过程的局部特征与该目标组历史局部特征的相似度对所述目标组历史局部特征进行更新,如果当前推理过程 的局部特征与目标组历史局部特征的相似度小于该阈值,则可以将当前推理过程的局部特征作为新的一组历史局部特征保存在本地。
图8为本申请实施例提供的另一种推理结构示意图，请参阅图8，联邦学习服务端接收多个客户端上传的局部特征，通过全局先验模型计算局部特征与样例库中的历史局部特征的相似度，并基于相似度对对应的hg(m)进行加权求和，作为全局先验特征。
联邦学习服务端的样例库中为表示第m条样例数据对应的历史局部特征集合,hg(m)表示第m条样例数据对应的历史全局先验特征。
相似度可以采用欧式距离、余弦距离或者训练的神经网络来度量，以余弦距离为例：
加权求和为:
联邦学习服务端下发全局先验特征hg给各个客户端,各客户端获取待推理数据,结合全局先验特征hg,得到本地推理结果。然后上传推理结果到联邦学习服务端,如果客户端有事后观测标注功能,则上传事后标注的标签到联邦学习服务端。
联邦学习服务端接收客户端上传的推理结果或者标注的标签，统计对应的类别数量最多的类y，从中找到全部类别相同的样例，即y(m)与y相同。计算和类别相同的样例的相似度，如欧式距离、余弦距离或者神经网络度量的相似度，对于相似度大于阈值的样例数据，根据相似度更新对应的hg(m)，如果全部选中的样例都不满足阈值，则将当前的局部特征作为一个新的样例。
以余弦距离为例:
相似度:
hg(m)的更新:hg(m)=r(m)*hg+(1-r(m))*hg(m)。
客户端在获得推理结果以及人为标注的标签后,还可以对图3中全局先验模型和局部特征提取模型、推理模型进行训练,具体的,各客户端计算推理结果和标签的误差,使用误差反向传播计算全局先验特征的误差,并将全局先验特征的误差上传给联邦学习服务端。联邦学习服务端接收各客户端上传的全局先验特征的误差,使用误差反向传播计算局部特征的误差,并下发给各客户端。各客户端和联邦学习服务端使用各自的误差计算模型梯度并更新模型,其中,客户端根据推理结果和标签的误差更新推理模型,根据局部特征的误差更新局部特征提取模型,联邦学习服务端根据全局先验特征的误差更新全局先验局部特征提取模型。
联邦学习服务端还可以保存多个客户端的部分数据,将该部分数据作为联邦学习服务端的本地数据,并基于联邦学习服务端的本地数据训练推理初始模型,具体此处不作限定。
图9为本申请实施例提供的一种联邦学习服务端的本地数据的获取方式示意图,如图9所示,联邦学习服务端使用本地数据训练“差异性判决模型”并下发到客户端侧。客户端侧使用联邦学习服务端下发的“差异性判决模型”计算差异性指标,并对差异性指标进行抽样,将抽样后的差异性指标上传到联邦学习服务端。联邦学习服务端收集差异性指标进行聚类,以获得全局差异性指标类中心,然后将该全 局差异性指标类中心下发给客户端。
客户端侧接收云侧下发的差异性指标类中心，将客户端侧本地数据向各类中心进行归类并统计归类结果，并统计各类中心中数据个数，以及远离类中心的数据的个数和重要数据的个数，将统计结果上传到联邦学习服务端。其中客户端侧本地数据包括普通数据和重要数据，该重要数据可以是入侵数据、错例数据、投票不一致数据、或离锚点远的数据，此处不作限定。
联邦学习服务端接收统计结果并结合本地数据生成采集策略(各客户端侧可不同),并下发给各客户端。
客户端侧接收联邦学习服务端下发的采集策略并采集数据,对采集的数据进行抽样,将抽样后的数据上传到联邦学习服务端。联邦学习服务端采用主动学习的方式对上传的数据打标签并下发给各客户端侧。
联邦学习服务端使用本地数据训练打标签模型并下发客户端侧。客户端侧使用该打标签模型对本地数据打标签,然后根据该打标签的本地数据和联邦学习服务端下发的有标签的数据对本地的推理模型进行训练。
联邦学习服务端还可以对联邦学习服务端的本地数据进行压缩,如采用训练数据生成器对数据进行压缩。且联邦学习服务端还可以基于本地数据训练推理模型作为客户端的初始推理模型,具体此处不作赘述。
对于图4中的混合专家模型的训练,则需要按照固定参数任务选择器生成、任务模型选择性下发、模型预训练、模型训练和模型按策略训练的方式进行训练。
图10为本申请实施例提供的一种固定参数任务选择器生成的流程示意图,具体请参阅图10:
步骤1001.联邦学习服务端按照类别生成聚类的K1个类中心信息。
步骤1002.联邦学习服务端下发K1个类中心信息到客户端。
步骤1003.客户端接收K1个聚类中心信息,将K类数据按聚类中心信息归到K1个聚类中心信息,统计每个聚类中心所属数据的均值和个数,并上传联邦学习服务端。
步骤1004.客户端将每个聚类中心所属数据的均值和个数上传联邦学习服务端。
步骤1005.联邦学习服务端计算全局的每个聚类中心所属数据的均值得到新的聚类中心信息。
步骤1006.联邦学习服务端判断新的聚类中心信息是否收敛,如果没有收敛,则转到步骤1002,如果收敛则转到步骤1007。收敛的条件可以是新的聚类中心信息较前一次的聚类中心信息的变化量小于阈值,或者是达到迭代次数。
步骤1007.联邦学习服务端下发新的K1个聚类中心信息。
步骤1008.客户端统计K个类的数据分别对应在K1个聚类中心信息的个数。
步骤1009.客户端上传K个类的数据分别对应在K1个聚类中心信息的个数到联邦学习服务端。
步骤1010.联邦学习服务端接收各客户端上传的K个类的数据分别对应在K1个聚类中心信息的个数,生成固定参数任务选择器规则。
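其中步骤1005的“由各客户端上报的均值和个数计算全局聚类中心”可示意如下（特征以标量表示，数据结构为假设，仅示意加权平均本身）：

```python
def aggregate_centers(client_stats):
    """client_stats: 每个客户端上报的 [(该聚类中心所属数据的均值, 个数), ...],
    各客户端列表的第j项对应同一个聚类中心。
    对每个聚类中心,用个数加权平均各客户端均值,得到新的全局聚类中心(示意)。"""
    n_centers = len(client_stats[0])
    centers = []
    for j in range(n_centers):
        total = sum(stats[j][1] for stats in client_stats)
        centers.append(
            sum(stats[j][0] * stats[j][1] for stats in client_stats) / total)
    return centers

stats = [
    [(1.0, 2), (10.0, 1)],   # 客户端1:中心0均值1.0共2条,中心1均值10.0共1条
    [(3.0, 2), (10.0, 3)],   # 客户端2
]
centers = aggregate_centers(stats)
```

由于只上传均值和个数而不上传原始数据，这一聚合方式与联邦学习的隐私要求一致。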
示例性的,方案1:对于需要隐私保护的场景:
客户端n上传的K个类的数据分别对应在K1个聚类中心信息的个数为矩阵An
表示第n个客户端上传的第k个类的属于第k1个聚类中心信息。
第一步:计算全局的K个类别数据对应到K1个聚类中心信息的A:

其中表示第k个类对应到K1个聚类中心信息的全局数据量。
第二步:为每个类别选择数量最多的topK1对应的聚类中心信息对应的聚类编号作为该类选择的专家。
其中bk=[bk,1 ... bk,k1]中有topK1个值不为零,bk中不为零的位置为ak中topK1对应的位置。bk中不为零的位置的值可以是1,也可以是数据量的加权值,加权值为:
在此基础上为了满足专家之间的任务均衡性,设置每个专家最多可以用topK2次,当专家被选中的次数超过topK2时,则该专家不再在topK1的选择时被选中。专家选择的顺序可以按专家被选中的数据量排序,从数据量最多的专家开始,比如ak,k1的值最大,则从第k个类开始选专家。
B作为固定参数任务选择器规则。
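方案1中“为每类按数据量选聚类中心（专家）、并限制每个专家最多被选中topK2次、从数据量最多的类开始选”的流程可示意如下（简化为每类只选1个专家，容量值为假设）：

```python
def build_selector(A, cap):
    """A[k][k1]: 第k类数据落入第k1个聚类中心的全局数据量。
    每类按数据量从大到小挑选专家,单个专家最多被选 cap 次;
    数据量多的类先选(示意)。返回 类别->专家编号 的规则。"""
    counts = {}
    rule = {}
    order = sorted(range(len(A)), key=lambda k: -max(A[k]))   # 数据量大的类先选
    for k in order:
        for k1 in sorted(range(len(A[k])), key=lambda j: -A[k][j]):
            if counts.get(k1, 0) < cap:                       # 专家容量限制
                rule[k] = k1
                counts[k1] = counts.get(k1, 0) + 1
                break
    return rule

A = [[9, 1], [8, 2], [0, 7]]      # 3个类别、2个聚类中心的全局数据量
rule = build_selector(A, cap=2)
```

规则矩阵B即可由该映射按0-1或数据量加权的方式展开得到。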
方案2:对于每个用户选中的专家个数尽可能少作为优化目标的场景:
第一步:对An对每类中属于不同聚类中心信息的数据进行归一化得到每类数据属于不同聚类中心信息的概率:
其中
第二步:将进行K3个聚类中心信息的聚类,属于同一个聚类中心信息的分在一个组中。
第三步：与方案1相同，只是限制每个组中的专家个数不能超过topK4，如果超过则只在topK4对应的专家中选择数据量最大的专家作为对应类别的专家。
第四步:对于组间不一致的类别和专家的匹配关系按类别和专家的匹配最多的方式,例如假设有10组,若其中专家2对应的组数最多,则选择专家2。
图11为本申请实施例提供的一种任务模型选择的流程示意图,请参阅图11:
步骤1101.联邦学习服务端下发成固定参数任务选择器规则。
步骤1102.客户端接收固定参数任务选择器规则,统计本地数据对应的任务模型ID号。
步骤1103.客户端将任务模型ID上传联邦学习服务端。
步骤1104.联邦学习服务端匹配各客户端任务模型ID并匹配对应的任务模型和参数可训练的任务模型选择器。
图12为本申请实施例提供的一种模型预训练的流程示意图,请参阅图12:
步骤1201.联邦学习服务端下发任务模型和参数可训练的任务模型选择器。
步骤1202.客户端接收参数可训练的任务模型选择器,统计本地数据对应的任务模型ID号。
步骤1203.客户端将任务模型ID上传联邦学习服务端。
步骤1204.联邦学习服务端匹配各客户端任务模型ID并匹配对应的任务模型。
步骤1205.联邦学习服务端下发任务模型。
步骤1206.客户端使用本地数据,使用联邦学习服务端的任务模型作为客户端的任务模型的初始值,通过固定参数任务选择器选择任务模型,并训练客户端的任务模型若干轮。
步骤1207.客户端使用本地数据,将联邦学习服务端的参数可训练的任务模型选择器作为客户端的参数可训练的任务模型选择器的初始值,通过固定参数任务选择器对数据打伪标签,然后训练参数可训练的任务模型选择器若干轮。
步骤1208.客户端上传更新后的客户端的任务模型和参数可训练的任务模型选择器。
步骤1209.联邦学习服务端接收各客户端更新后的客户端的任务模型和参数可训练的任务模型选择器,加权平均得到联邦学习服务端的任务模型和参数可训练的任务模型选择器。
步骤1210.重复步骤1201到步骤1209若干次。
图13为本申请实施例提供的一种模型训练的流程示意图,请参阅图13:
步骤1301.联邦学习服务端下发联邦学习服务端的参数可训练的任务模型选择器。
步骤1302.客户端接收联邦学习服务端的参数可训练的任务模型选择器,统计本地数据对应的任务模型ID号。
步骤1303.客户端上传任务模型ID号到联邦学习服务端。
步骤1304.联邦学习服务端接收各客户端任务模型ID并匹配对应的联邦学习服务端的任务模型。
步骤1305.联邦学习服务端下发对应的联邦学习服务端的任务模型。
步骤1306.客户端使用本地数据,将联邦学习服务端的任务模型作为初始模型,通过联邦学习服务端的参数可训练的任务模型选择器选择任务模型并训练客户端的任务模型N轮。
步骤1307.客户端并上传更新的客户端的任务模型到联邦学习服务端。
步骤1308.联邦学习服务端接收各客户端更新后的客户端的任务模型,加权平均得到联邦学习服务端的任务模型。
步骤1309.联邦学习服务端并下发更新的联邦学习服务端的任务模型到客户端。
步骤1310.客户端使用本地数据,将联邦学习服务端的任务模型作为客户端的任务模型,将联邦学习服务端的参数可训练的任务模型选择器作为客户端的参数可训练的任务模型选择器的初始值,通过客户端的参数可训练的任务模型选择器选择任务模型并训练客户端的参数可训练的任务模型选择器N轮。
步骤1311.客户端并上传更新的客户端的参数可训练的任务模型选择器。
步骤1312.联邦学习服务端接收各更新后的客户端参数可训练的任务模型选择器,加权平均得到联邦学习服务端参数可训练的任务模型选择器。
步骤1313.重复步骤1301到步骤1312若干次。
图14为本申请实施例提供的一种模型按策略训练的流程示意图,请参阅图14:
步骤1401.联邦学习服务端下发联邦学习服务端的参数可训练的任务模型选择器。
步骤1402.客户端接收联邦学习服务端的参数可训练的任务模型选择器,统计本地数据各任务模型处理各类数据的条数。
步骤1403.客户端并上传统计结果到联邦学习服务端。
步骤1404.联邦学习服务端接收各客户端各任务模型处理各类数据的条数,统计全局各任务模型处理各类数据的条数。
步骤1405.联邦学习服务端生成任务模型与各类数据对应关系。
步骤1406.联邦学习服务端下发任务模型与各类数据对应关系。
步骤1407.客户端使用本地数据,使用联邦学习服务端的任务模型作为客户端的任务模型的初始值,通过任务模型与各类数据对应关系选择客户端的任务模型作为学生,将联邦学习服务端的任务模型作为老师模型,通过知识蒸馏训练客户端的任务模型N轮。
步骤1408.客户端使用本地数据,将联邦学习服务端的参数可训练的任务模型选择器作为客户端的参数可训练的任务模型选择器的初始值,通过任务模型与各类数据对应关系对数据打伪标签训练参数可训练的任务模型选择器L6轮。
步骤1409.客户端上传更新后的客户端的任务模型和参数可训练的任务模型选择器。
步骤1410.联邦学习服务端接收各更新后的客户端参数可训练的任务模型选择器,加权平均得到联邦学习服务端参数可训练的任务模型选择器。
步骤1411.重复步骤1401到步骤1410若干次。
本申请实施例中,第一联邦学习客户端采集第一时隙的与目标网络资源有关的第一网络数据,并提取该第一网络数据的第一局部特征发送给联邦学习服务端,联邦学习服务端收集多个客户端上传的局部特征计算全局先验特征,并下发给各客户端,第一联邦学习客户端可以根据该全局先验特征对第二时隙的第二网络数据进行推理,在推理过程中利用了除本地数据外其他客户端的数据的信息,用于提高推理结果的准确性。
上面讲述了推理方法,下面对执行该方法的装置进行描述。
图15为本申请实施例提供的一种推理装置的结构示意图,请参阅图15,该装置150包括:
收发单元1501,用于向联邦学习服务端发送第一局部特征,第一局部特征是从第一网络数据中提取出的,第一网络数据为第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,目标网络资源为第一联邦学习客户端管理的网络资源,接收来自联邦学习服务端的全局先验特征,全局先验特征是根据第一局部特征和第二局部特征得到的,第二局部特征是由第二联邦学习客户端提供;
处理单元1502,用于根据全局先验特征和第二网络数据进行推理,以得到推理结果,第二网络数据为第一联邦学习客户端在第二时隙获取的与目标网络资源有关的数据,推理结果用于管理目标网络资源,其中,第二时隙与第一时隙相同或在第一时隙之后。
其中,收发单元1501用于执行图2方法实施例中的步骤204和步骤206,处理单元1502用于执行图2方法实施例中的步骤207。
可选的,第一网络数据为目标网络资源有关的数据在第一时隙的采样值或者从第三时隙到第一时隙为止的统计值,第二网络数据为目标网络资源有关的数据在第二时隙的采样值,第三时隙在第一时隙之前。
可选的,全局先验特征为特征向量或第一机器学习模型,第一机器学习模型用于第二网络数据的推理。
可选的,在全局先验特征为特征向量的情况下,处理单元1502具体用于:
根据全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果,第二机器学习模型用于第二网络数据的推理。
可选的,处理单元1502具体用于:将第二网络数据输入到第三学习模型中,以得到第三学习模型输出的第二网络数据的多个特征;将全局先验特征输入到第二机器学习模型中,以获得第二网络数据的多个特征各自的权重;根据第二网络数据的多个特征以及第二网络数据的多个特征各自的权重,确定推理结果。
可选的,第二机器学习模型包括多个第一任务模型;处理单元1502具体用于:根据全局先验特征,计算多个第一任务模型各自的权重;将第二网络数据的特征输入到多个第一任务模型中,以得到多个第一任务模型输出的推理特征;根据多个第一任务模型各自的权重以及多个第一任务模型输出的推理特征,得到推理结果。
可选的,处理单元1502具体用于:根据全局先验特征和第二网络数据,计算多个第一任务模型各自的权重。
可选的,处理单元还用于:通过第三机器学习模型提取第二网络数据的特征。
可选的，第三机器学习模型包括多个第二任务模型；处理单元1502具体用于：根据第二网络数据确定多个第二任务模型各自的权重；将第二网络数据输入到多个第二任务模型中，以得到多个第二任务模型输出的第二网络数据的子特征；根据多个第二任务模型各自的权重和多个第二任务模型输出的第二网络数据的子特征，得到第二网络数据的特征。
可选的,每个第二任务模型是一层自编码器,多个第二任务模型中第r个任务模型的重构目标是第r-1个任务模型的残差,其中,r为大于1的整数且表示第二任务模型的数量。
可选的,在全局先验特征为第一机器学习模型的情况下,处理单元1502具体用于:提取第二网络数据的特征;将第二网络数据的特征输入到第一机器学习模型中,以得到第一机器学习模型输出的推理结果。
可选的,在全局先验特征为第一机器学习模型的情况下,处理单元1502具体用于:利用样本数据对第一机器学习模型进行训练;提取第二网络数据的特征;将第二网络数据的特征输入到训练后的第一机器学习模型中,以得到训练后的第一机器学习模型输出的推理结果。
可选的,收发单元1501还用于:向联邦学习服务端发送分组信息,分组信息指示第一局部特征所在的分组,以使得联邦学习服务端根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征。
可选的,收发单元1501还用于:接收来自联邦学习服务端的任务同步信息,任务同步信息用于指示第一时隙;根据任务同步信息从本地数据中选择出第一网络数据。
图16为本申请实施例提供的另一种推理装置的结构示意图,请参阅图16,该装置160包括:
收发单元1601,用于接收来自第一联邦学习客户端的第一局部特征,第一局部特征是从第一网络数据中提取出的,第一网络数据为第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,目标网络资源为第一联邦学习客户端管理的网络资源;
处理单元1602,用于根据第一局部特征和第二局部特征,得到全局先验特征,第二局部特征是由第二联邦学习客户端提供;
收发单元1601,用于向第一联邦学习客户端发送全局先验特征,使得第一联邦学习客户端根据全局先验特征和第二网络数据进行推理,以得到推理结果,第二网络数据为第一联邦学习客户端在第二时隙获取的与目标网络资源有关的数据,推理结果用于管理目标网络资源,其中,第二时隙与第一时隙相同或在第一时隙之后。
其中,收发单元1601用于执行图2方法实施例中的步骤204和步骤206,处理单元1602用于执行图2方法实施例中的步骤205。
可选的,第一网络数据为目标网络资源有关的数据在第一时隙的采样值或者从第三时隙到第一时隙为止的统计值,第二网络数据为目标网络资源有关的数据在第二时隙的采样值,第三时隙在第一时隙之前。
可选的,全局先验特征为特征向量或第一机器学习模型,第一机器学习模型用于第二网络数据的推理。
可选的,收发单元1601还用于:接收来自第一联邦学习客户端的分组信息,来自第一联邦学习客户端的分组信息指示第一局部特征所在的分组;
处理单元1602具体用于:根据第一局部特征、第一局部特征所在的分组、第二局部特征以及第二局部特征所在的分组得到全局先验特征,第二局部特征所在的分组是由来自第二联邦学习客户端的分组信息指示的。
可选的,第一局部特征包括第一子特征和第二子特征,第二局部特征包括第三子特征和第四子特征,来自第一联邦学习客户端的分组信息指示第一子特征所在的分组和第二子特征所在的分组,来自第二联邦学习客户端的分组信息指示第三子特征所在的分组和第四子特征所在的分组,且第一子特征所在的分组和第三子特征所在的分组相同;
处理单元1602具体用于:基于第一子特征所在的分组和第三子特征所在的分组相同,联邦学习服务端对第一子特征和第三子特征进行处理,得到中间特征;根据中间特征、第二子特征、第四子特征、 第二子特征所在的分组以及第四子特征所在的分组,得到全局先验特征。
可选的,处理单元1602具体用于:根据第一局部特征、第二局部特征、来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征,得到全局先验特征。
可选的,处理单元1602具体用于:计算当前推理过程的局部特征与多组历史局部特征的相似度,当前推理过程的局部特征包括第一局部特征和第二局部特征,每组历史局部特征包括一次历史推理过程中来自第一联邦学习客户端的历史局部特征以及来自第二联邦学习客户端的历史局部特征;根据当前推理过程的局部特征与多组历史局部特征的相似度,对多组历史局部特征对应的历史先验特征进行加权求和,以得到全局先验特征。
可选的,多组历史局部特征都具有标签,标签为人为标注的每组历史局部特征的实际结果;处理单元1602还用于:接收来自第一联邦学习客户端的推理结果;根据第一联邦学习客户端的推理结果和第二联邦学习客户端的推理结果确定目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的情况下,根据当前推理过程的局部特征与目标组历史局部特征的相似度对目标组历史局部特征进行更新,目标组历史局部特征为多组历史局部特征中,标签为目标推理结果;在当前推理过程的局部特征与目标组历史局部特征的相似度小于阈值的情况下,在多组历史局部特征的基础上增加一组历史局部特征,增加的一组历史局部特征为当前推理过程的局部特征。
可选的,收发单元1601还用于:向第一联邦学习客户端发送任务同步信息,任务同步信息用于指示第一时隙,以使得第一联邦学习客户端根据任务同步信息从本地数据中选择出第一网络数据。
图17为本申请的实施例提供的计算机设备170的一种可能的逻辑结构示意图。计算机设备170包括:处理器1701、通信接口1702、存储系统1703以及总线1704。处理器1701、通信接口1702以及存储系统1703通过总线1704相互连接。在本申请的实施例中,处理器1701用于对计算机设备170的动作进行控制管理,例如,处理器1701用于执行图2的方法实施例中发送端所执行的步骤。通信接口1702用于支持计算机设备170进行通信。存储系统1703,用于存储计算机设备170的程序代码和数据。
其中,处理器1701可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器1701也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线1704可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图17中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
装置150中的收发单元1501相当于计算机设备170中的通信接口1702,装置150中的处理单元1502相当于计算机设备170中的处理器1701。
本实施例的计算机设备170可对应于上述图2方法实施例中的第一联邦学习客户端,该计算机设备170中的通信接口1702可以实现上述图2方法实施例中的第一联邦学习客户端所具有的功能和/或所实施的各种步骤,为了简洁,在此不再赘述。
图18为本申请的实施例提供的计算机设备180的另一种可能的逻辑结构示意图。计算机设备180包括:处理器1801、通信接口1802、存储系统1803以及总线1804。处理器1801、通信接口1802以及存储系统1803通过总线1804相互连接。在本申请的实施例中,处理器1801用于对计算机设备180的动作进行控制管理,例如,处理器1801用于执行图2的方法实施例中接收端所执行的步骤。通信接口1802用于支持计算机设备180进行通信。存储系统1803,用于存储计算机设备180的程序代码和数据。
其中,处理器1801可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器1801也可以是实现计算功 能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线1804可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图18中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
装置160中的收发单元1601相当于计算机设备180中的通信接口1802,装置160中的处理单元1602相当于计算机设备180中的处理器1801。
本实施例的计算机设备180可对应于上述图2方法实施例中的接收端,该计算机设备180中的通信接口1802可以实现上述图2方法实施例中的接收端所具有的功能和/或所实施的各种步骤,为了简洁,在此不再赘述。
应理解以上装置中单元的划分仅仅是一种逻辑功能的划分，实际实现时可以全部或部分集成到一个物理实体上，也可以物理上分开。且装置中的单元可以全部以软件通过处理元件调用的形式实现；也可以全部以硬件的形式实现；还可以部分单元以软件通过处理元件调用的形式实现，部分单元以硬件的形式实现。例如，各个单元可以为单独设立的处理元件，也可以集成在装置的某一个芯片中实现，此外，也可以以程序的形式存储于存储器中，由装置的某一个处理元件调用并执行该单元的功能。此外这些单元全部或部分可以集成在一起，也可以独立实现。这里所述的处理元件又可以称为处理器，可以是一种具有信号的处理能力的集成电路。在实现过程中，上述方法的各步骤或以上各个单元可以通过处理器元件中的硬件的集成逻辑电路实现或者以软件通过处理元件调用的形式实现。
在一个例子中，以上任一装置中的单元可以是被配置成实施以上方法的一个或多个集成电路，例如：一个或多个专用集成电路（application specific integrated circuit，ASIC），或，一个或多个数字信号处理器（digital signal processor，DSP），或，一个或者多个现场可编程门阵列（field programmable gate array，FPGA），或这些集成电路形式中至少两种的组合。再如，当装置中的单元可以通过处理元件调度程序的形式实现时，该处理元件可以是通用处理器，例如中央处理器（central processing unit，CPU）或其它可以调用程序的处理器。再如，这些单元可以集成在一起，以片上系统（system-on-a-chip，SOC）的形式实现。
在本申请的另一个实施例中,还提供一种计算机可读存储介质,计算机可读存储介质中存储有计算机执行指令,当设备的处理器执行该计算机执行指令时,设备执行上述方法实施例中第一联邦学习客户端所执行的方法。
在本申请的另一个实施例中,还提供一种计算机可读存储介质,计算机可读存储介质中存储有计算机执行指令,当设备的处理器执行该计算机执行指令时,设备执行上述方法实施例中联邦学习服务端所执行的方法。
在本申请的另一个实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中。当设备的处理器执行该计算机执行指令时,设备执行上述方法实施例中第一联邦学习客户端所执行的方法。
在本申请的另一个实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中。当设备的处理器执行该计算机执行指令时,设备执行上述方法实施例中联邦学习服务端所执行的方法。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以 是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务端,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。

Claims (49)

  1. 一种推理方法,其特征在于,所述方法包括:
    第一联邦学习客户端向联邦学习服务端发送第一局部特征,所述第一局部特征是从第一网络数据中提取出的,所述第一网络数据为所述第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,所述目标网络资源为所述第一联邦学习客户端管理的网络资源;
    所述第一联邦学习客户端接收来自所述联邦学习服务端的全局先验特征,所述全局先验特征是根据所述第一局部特征和第二局部特征得到的,所述第二局部特征是由第二联邦学习客户端提供;
    所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果,所述第二网络数据为所述第一联邦学习客户端在第二时隙获取的与所述目标网络资源有关的数据,所述推理结果用于管理所述目标网络资源,其中,所述第二时隙与所述第一时隙相同或在所述第一时隙之后。
  2. 根据权利要求1所述的方法,其特征在于,所述第一网络数据为所述目标网络资源有关的数据在所述第一时隙的采样值或者从第三时隙到所述第一时隙为止的统计值,所述第二网络数据为所述目标网络资源有关的数据在所述第二时隙的采样值,所述第三时隙在所述第一时隙之前。
  3. 根据权利要求1或2所述的方法,其特征在于,所述全局先验特征为特征向量或第一机器学习模型,所述第一机器学习模型用于所述第二网络数据的推理。
  4. 根据权利要求3所述的方法,其特征在于,在所述全局先验特征为所述特征向量的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果,所述第二机器学习模型用于所述第二网络数据的推理。
  5. 根据权利要求4所述的方法,其特征在于,所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:
    所述第一联邦学习客户端将第二网络数据输入到第三学习模型中,以得到所述第三学习模型输出的所述第二网络数据的多个特征;
    所述第一联邦学习客户端将所述全局先验特征输入到所述第二机器学习模型中,以获得所述第二网络数据的多个特征各自的权重;
    所述第一联邦学习客户端根据所述第二网络数据的多个特征以及所述第二网络数据的多个特征各自的权重,确定所述推理结果。
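权利要求5所述的"提取多个特征、由全局先验特征得出各特征的权重、再组合得到推理结果"的流程,可用如下最小化草图示意。这是一个假设性示例:其中的 extract_features 对应第三学习模型,weight_model 对应第二机器学习模型,二者均为本示例虚构的接口;以 softmax 归一化权重、以加权求和组合特征也只是示意性假设,并非本申请限定的具体实现:

```python
import numpy as np

def softmax(z):
    # 数值稳定的 softmax,将权重归一化为和为 1 的分布
    e = np.exp(z - np.max(z))
    return e / e.sum()

def infer_with_prior(x2, prior, extract_features, weight_model):
    feats = extract_features(x2)        # 第二网络数据的多个特征,形状 (k, d)
    w = softmax(weight_model(prior))    # 由全局先验特征得到 k 个特征各自的权重
    return w @ feats                    # 按权重组合多个特征,作为推理结果
```

例如,当 weight_model 对先验特征输出全零向量时,各特征权重相等,推理结果即各特征的均值。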
  6. 根据权利要求4所述的方法,其特征在于,所述第二机器学习模型包括多个第一任务模型;
    所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:
    所述第一联邦学习客户端根据所述全局先验特征,计算所述多个第一任务模型各自的权重;
    所述第一联邦学习客户端将第二网络数据的特征输入到所述多个第一任务模型中,以得到所述多个第一任务模型输出的推理特征;
    所述第一联邦学习客户端根据所述多个第一任务模型各自的权重以及所述多个第一任务模型输出的推理特征,得到推理结果。
  7. 根据权利要求6所述的方法,其特征在于,所述第一联邦学习客户端根据所述全局先验特征,计算所述多个第一任务模型各自的权重包括:
    所述第一联邦学习客户端根据所述全局先验特征和所述第二网络数据,计算所述多个第一任务模型各自的权重。
  8. 根据权利要求6或7所述的方法,其特征在于,所述方法还包括:
    所述第一联邦学习客户端通过第三机器学习模型提取所述第二网络数据的特征。
  9. 根据权利要求8所述的方法,其特征在于,所述第三机器学习模型包括多个第二任务模型;
    所述第一联邦学习客户端通过第三机器学习模型提取所述第二网络数据的特征包括:
    所述第一联邦学习客户端根据所述第二网络数据确定所述多个第二任务模型各自的权重;
    所述第一联邦学习客户端将所述第二网络数据输入到所述多个第二任务模型中,以得到所述多个第二任务模型输出的所述第二网络数据的子特征;
    根据所述多个第二任务模型各自的权重和所述多个第二任务模型输出的所述第二网络数据的子特征,得到所述第二网络数据的特征。
  10. 根据权利要求9所述的方法,其特征在于,每个第二任务模型是一层自编码器,所述多个第二任务模型中第r个任务模型的重构目标是第r-1个任务模型的残差,其中,r为大于1且不大于所述第二任务模型的数量的整数。
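权利要求10所述的"第r个自编码器以第r-1个的重构残差为目标"的逐层训练方式,可用线性自编码器粗略示意如下。这只是一个假设性草图:自编码器结构、学习率与训练轮数均为示例取值,梯度下降实现也仅用于说明残差逐层传递的思路:

```python
import numpy as np

def train_residual_autoencoders(X, n_models, dim, lr=1e-2, epochs=200, seed=0):
    """逐层训练多个一层(线性)自编码器:第 1 个重构原始数据 X,
    第 r 个(r>1)的重构目标是第 r-1 个的残差。返回各层的编码/解码矩阵。"""
    rng = np.random.default_rng(seed)
    target = X
    models = []
    for _ in range(n_models):
        W = rng.normal(scale=0.1, size=(X.shape[1], dim))  # 编码矩阵
        V = rng.normal(scale=0.1, size=(dim, X.shape[1]))  # 解码矩阵
        for _ in range(epochs):
            H = target @ W
            R = H @ V - target                  # 重构误差
            gV = H.T @ R / len(target)          # 对 V 的梯度
            gW = target.T @ (R @ V.T) / len(target)  # 对 W 的梯度
            W -= lr * gW
            V -= lr * gV
        models.append((W, V))
        target = target - (target @ W) @ V      # 残差作为下一层的重构目标
    return models
```

训练完成后,各层编码输出即可作为第二网络数据的子特征,再按权利要求9的方式加权组合。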
  11. 根据权利要求3所述的方法,其特征在于,在所述全局先验特征为所述第一机器学习模型的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端提取所述第二网络数据的特征;
    所述第一联邦学习客户端将所述第二网络数据的特征输入到所述第一机器学习模型中,以得到所述第一机器学习模型输出的推理结果。
  12. 根据权利要求3所述的方法,其特征在于,在所述全局先验特征为所述第一机器学习模型的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端利用样本数据对所述第一机器学习模型进行训练;
    所述第一联邦学习客户端提取所述第二网络数据的特征;
    所述第一联邦学习客户端将所述第二网络数据的特征输入到训练后的所述第一机器学习模型中,以得到训练后的所述第一机器学习模型输出的推理结果。
  13. 根据权利要求1至12中任意一项所述的方法,其特征在于,所述方法还包括:
    所述第一联邦学习客户端向所述联邦学习服务端发送分组信息,所述分组信息指示所述第一局部特征所在的分组,以使得所述联邦学习服务端根据所述第一局部特征、所述第一局部特征所在的分组、所述第二局部特征以及所述第二局部特征所在的分组得到全局先验特征。
  14. 根据权利要求1至13中任意一项所述的方法,其特征在于,所述方法还包括:
    所述第一联邦学习客户端接收来自所述联邦学习服务端的任务同步信息,所述任务同步信息用于指示所述第一时隙;
    所述第一联邦学习客户端根据所述任务同步信息从本地数据中选择出所述第一网络数据。
  15. 一种推理方法,其特征在于,所述方法包括:
    联邦学习服务端接收来自第一联邦学习客户端的第一局部特征,所述第一局部特征是从第一网络数据中提取出的,所述第一网络数据为所述第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,所述目标网络资源为所述第一联邦学习客户端管理的网络资源;
    所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征,所述第二局部特征是由第二联邦学习客户端提供;
    所述联邦学习服务端向所述第一联邦学习客户端发送所述全局先验特征,使得所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果,所述第二网络数据为所述第一联邦学习客户端在第二时隙获取的与所述目标网络资源有关的数据,所述推理结果用于管理所述目标网络资源,其中,所述第二时隙与所述第一时隙相同或在所述第一时隙之后。
  16. 根据权利要求15所述的方法,其特征在于,所述第一网络数据为所述目标网络资源有关的数据在所述第一时隙的采样值或者从第三时隙到所述第一时隙为止的统计值,所述第二网络数据为所述目标网络资源有关的数据在所述第二时隙的采样值,所述第三时隙在所述第一时隙之前。
  17. 根据权利要求15或16所述的方法,其特征在于,所述全局先验特征为特征向量或第一机器学习模型,所述第一机器学习模型用于所述第二网络数据的推理。
  18. 根据权利要求15至17中任意一项所述的方法,其特征在于,所述方法还包括:
    所述联邦学习服务端接收来自所述第一联邦学习客户端的分组信息,所述来自所述第一联邦学习客户端的分组信息指示所述第一局部特征所在的分组;
    所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征包括:
    所述联邦学习服务端根据所述第一局部特征、所述第一局部特征所在的分组、所述第二局部特征以及所述第二局部特征所在的分组得到全局先验特征,所述第二局部特征所在的分组是由来自所述第二联邦学习客户端的分组信息指示的。
  19. 根据权利要求18所述的方法,其特征在于,所述第一局部特征包括第一子特征和第二子特征,所述第二局部特征包括第三子特征和第四子特征,所述来自所述第一联邦学习客户端的分组信息指示所述第一子特征所在的分组和所述第二子特征所在的分组,所述来自所述第二联邦学习客户端的分组信息指示所述第三子特征所在的分组和所述第四子特征所在的分组,且所述第一子特征所在的分组和所述第三子特征所在的分组相同;
    所述联邦学习服务端根据所述第一局部特征、所述第一局部特征所在的分组、所述第二局部特征以及所述第二局部特征所在的分组得到全局先验特征包括:
    基于所述第一子特征所在的分组和所述第三子特征所在的分组相同,所述联邦学习服务端对所述第一子特征和所述第三子特征进行处理,得到中间特征;
    所述联邦学习服务端根据所述中间特征、所述第二子特征、所述第四子特征、所述第二子特征所在的分组以及所述第四子特征所在的分组,得到全局先验特征。
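权利要求19所述的"同组子特征先融合为中间特征,再与其余分组的子特征一并得到全局先验特征"的聚合过程,可用如下草图示意。此处假设同组子特征取均值、各组结果按分组顺序拼接,这些均为示例性选择,并非本申请限定的处理方式:

```python
from collections import defaultdict
import numpy as np

def aggregate_by_group(subfeatures):
    """subfeatures: [(分组id, 子特征向量), ...],来自各联邦学习客户端。
    同组子特征先融合(此处取均值)为中间特征,再拼接为全局先验特征。"""
    groups = defaultdict(list)
    for gid, feat in subfeatures:
        groups[gid].append(np.asarray(feat))
    merged = {g: np.mean(f, axis=0) for g, f in groups.items()}  # 每组一个中间特征
    return np.concatenate([merged[g] for g in sorted(merged)])
```

例如第一客户端的第一子特征与第二客户端的第三子特征分组相同,则二者先被平均为一个中间特征,再与其余子特征拼接。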
  20. 根据权利要求15至19中任意一项所述的方法,其特征在于,所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征包括:
    所述联邦学习服务端根据所述第一局部特征、第二局部特征、来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征,得到全局先验特征。
  21. 根据权利要求20所述的方法,其特征在于,所述联邦学习服务端根据所述第一局部特征、第二局部特征、来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征,得到全局先验特征包括:
    所述联邦学习服务端计算当前推理过程的局部特征与多组历史局部特征的相似度,所述当前推理过程的局部特征包括所述第一局部特征和所述第二局部特征,每组历史局部特征包括一次历史推理过程中来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征;
    所述联邦学习服务端根据所述当前推理过程的局部特征与多组历史局部特征的相似度,对所述多组历史局部特征对应的历史先验特征进行加权求和,以得到所述全局先验特征。
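权利要求21所述的"按当前局部特征与各组历史局部特征的相似度,对对应的历史先验特征加权求和"的计算,可用如下草图示意。相似度采用余弦相似度、权重做归一化均为示例性假设:

```python
import numpy as np

def global_prior_from_history(local_feats, hist_feats, hist_priors):
    """local_feats: 当前推理过程中各客户端的局部特征列表;
    hist_feats: 多组历史局部特征(每组为各客户端历史局部特征的列表);
    hist_priors: 各组历史局部特征对应的历史先验特征。"""
    cur = np.concatenate(local_feats)                        # 拼接为当前推理过程的局部特征
    H = np.stack([np.concatenate(g) for g in hist_feats])    # (n_groups, d)
    sims = H @ cur / (np.linalg.norm(H, axis=1) * np.linalg.norm(cur) + 1e-12)
    w = sims / (sims.sum() + 1e-12)                          # 归一化为权重
    return w @ np.stack(hist_priors)                         # 对历史先验特征加权求和
```

当前局部特征与某组历史局部特征越相似,该组对应的历史先验特征在全局先验特征中占的权重越大。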
  22. 根据权利要求21所述的方法,其特征在于,所述多组历史局部特征都具有标签,所述标签为人为标注的每组所述历史局部特征的实际结果;
    所述方法还包括:
    所述联邦学习服务端接收来自所述第一联邦学习客户端的推理结果;
    所述联邦学习服务端根据所述第一联邦学习客户端的推理结果和所述第二联邦学习客户端的推理结果确定目标推理结果;
    在所述当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的情况下,所述联邦学习服务端根据所述当前推理过程的局部特征与所述目标组历史局部特征的相似度对所述目标组历史局部特征进行更新,所述目标组历史局部特征为所述多组历史局部特征中,标签为所述目标推理结果的一组历史局部特征;
    在所述当前推理过程的局部特征与所述目标组历史局部特征的相似度小于所述阈值的情况下,所述联邦学习服务端在所述多组历史局部特征的基础上增加一组历史局部特征,增加的一组历史局部特征为所述当前推理过程的局部特征。
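权利要求22所述的历史局部特征库的更新逻辑(相似度达到阈值则更新目标组,否则新增一组)可粗略示意如下。其中按相似度在新旧特征之间插值的更新规则是本示例假设的一种做法,并非本申请限定的更新方式:

```python
import numpy as np

def cosine(a, b):
    # 余弦相似度,加小量避免除零
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def update_memory(memory, cur_feat, target_label, threshold):
    """memory: [{"label": 标签, "feat": 历史局部特征}, ...]。
    标签为目标推理结果的目标组若与当前局部特征足够相似则融合更新,
    否则把当前推理过程的局部特征作为新的一组加入。"""
    for group in memory:
        if group["label"] != target_label:
            continue
        s = cosine(cur_feat, group["feat"])
        if s >= threshold:
            # 假设的更新规则:按相似度在旧特征与新特征之间插值
            group["feat"] = (1 - s) * group["feat"] + s * cur_feat
            return memory
    memory.append({"label": target_label, "feat": cur_feat})
    return memory
```

这样,历史库会随着新的推理过程逐步吸收相似样本、并为不相似的样本开辟新的分组。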
  23. 根据权利要求15至22中任意一项所述的方法,其特征在于,所述方法还包括:
    所述联邦学习服务端向所述第一联邦学习客户端发送任务同步信息,所述任务同步信息用于指示所述第一时隙,以使得所述第一联邦学习客户端根据所述任务同步信息从本地数据中选择出所述第一网络数据。
  24. 一种推理系统,其特征在于,包括:
    第一联邦学习客户端向联邦学习服务端发送第一局部特征,所述第一局部特征是从第一网络数据中提取出的,所述第一网络数据为所述第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,所述目标网络资源为所述第一联邦学习客户端管理的网络资源;
    所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征,所述第二局部特征是由第二联邦学习客户端提供;
    所述联邦学习服务端向所述第一联邦学习客户端发送所述全局先验特征;
    所述第一联邦学习客户端接收来自所述联邦学习服务端的所述全局先验特征;
    所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果,所述第二网络数据为所述第一联邦学习客户端在第二时隙获取的与所述目标网络资源有关的数据,所述推理结果用于管理所述目标网络资源,其中,所述第二时隙与所述第一时隙相同或在所述第一时隙之后。
  25. 根据权利要求24所述的系统,其特征在于,所述第一网络数据为所述目标网络资源有关的数据在所述第一时隙的采样值或者从第三时隙到所述第一时隙为止的统计值,所述第二网络数据为所述目标网络资源有关的数据在所述第二时隙的采样值,所述第三时隙在所述第一时隙之前。
  26. 根据权利要求24或25所述的系统,其特征在于,所述全局先验特征为特征向量或第一机器学习模型,所述第一机器学习模型用于所述第二网络数据的推理。
  27. 根据权利要求26所述的系统,其特征在于,在所述全局先验特征为所述特征向量的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果,所述第二机器学习模型用于所述第二网络数据的推理。
  28. 根据权利要求27所述的系统,其特征在于,所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:
    所述第一联邦学习客户端将第二网络数据输入到第三学习模型中,以得到所述第三学习模型输出的所述第二网络数据的多个特征;
    所述第一联邦学习客户端将所述全局先验特征输入到所述第二机器学习模型中,以获得所述第二网络数据的多个特征各自的权重;
    所述第一联邦学习客户端根据所述第二网络数据的多个特征以及所述第二网络数据的多个特征各自的权重,确定所述推理结果。
  29. 根据权利要求27所述的系统,其特征在于,所述第二机器学习模型包括多个第一任务模型;
    所述第一联邦学习客户端根据所述全局先验特征、第二网络数据以及本地的第二机器学习模型进行推理,以得到推理结果包括:
    所述第一联邦学习客户端根据所述全局先验特征,计算所述多个第一任务模型各自的权重;
    所述第一联邦学习客户端将第二网络数据的特征输入到所述多个第一任务模型中,以得到所述多个第一任务模型输出的推理特征;
    所述第一联邦学习客户端根据所述多个第一任务模型各自的权重以及所述多个第一任务模型输出的推理特征,得到推理结果。
  30. 根据权利要求29所述的系统,其特征在于,所述第一联邦学习客户端根据所述全局先验特征,计算所述多个第一任务模型各自的权重包括:
    所述第一联邦学习客户端根据所述全局先验特征和所述第二网络数据,计算所述多个第一任务模型各自的权重。
  31. 根据权利要求29或30所述的系统,其特征在于,所述系统还包括:
    所述第一联邦学习客户端通过第三机器学习模型提取所述第二网络数据的特征。
  32. 根据权利要求31所述的系统,其特征在于,所述第三机器学习模型包括多个第二任务模型;
    所述第一联邦学习客户端通过第三机器学习模型提取所述第二网络数据的特征包括:
    所述第一联邦学习客户端根据所述第二网络数据确定所述多个第二任务模型各自的权重;
    所述第一联邦学习客户端将所述第二网络数据输入到所述多个第二任务模型中,以得到所述多个第二任务模型输出的所述第二网络数据的子特征;
    根据所述多个第二任务模型各自的权重和所述多个第二任务模型输出的所述第二网络数据的子特征,得到所述第二网络数据的特征。
  33. 根据权利要求32所述的系统,其特征在于,每个第二任务模型是一层自编码器,所述多个第二任务模型中第r个任务模型的重构目标是第r-1个任务模型的残差,其中,r为大于1且不大于所述第二任务模型的数量的整数。
  34. 根据权利要求26所述的系统,其特征在于,在所述全局先验特征为所述第一机器学习模型的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端提取所述第二网络数据的特征;
    所述第一联邦学习客户端将所述第二网络数据的特征输入到所述第一机器学习模型中,以得到所述第一机器学习模型输出的推理结果。
  35. 根据权利要求26所述的系统,其特征在于,在所述全局先验特征为所述第一机器学习模型的情况下,所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果包括:
    所述第一联邦学习客户端利用样本数据对所述第一机器学习模型进行训练;
    所述第一联邦学习客户端提取所述第二网络数据的特征;
    所述第一联邦学习客户端将所述第二网络数据的特征输入到训练后的所述第一机器学习模型中,以得到训练后的所述第一机器学习模型输出的推理结果。
  36. 根据权利要求24至35中任意一项所述的系统,其特征在于,所述系统还包括:
    所述第一联邦学习客户端向所述联邦学习服务端发送分组信息,所述第一联邦学习客户端的分组信息指示所述第一局部特征所在的分组;
    所述联邦学习服务端接收来自所述第一联邦学习客户端的分组信息;
    所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征包括:
    所述联邦学习服务端根据所述第一局部特征、所述第一局部特征所在的分组、所述第二局部特征以及所述第二局部特征所在的分组得到全局先验特征,所述第二局部特征所在的分组是由来自所述第二联邦学习客户端的分组信息指示的。
  37. 根据权利要求36所述的系统,其特征在于,所述第一局部特征包括第一子特征和第二子特征,所述第二局部特征包括第三子特征和第四子特征,所述来自所述第一联邦学习客户端的分组信息指示所述第一子特征所在的分组和所述第二子特征所在的分组,所述来自所述第二联邦学习客户端的分组信息指示所述第三子特征所在的分组和所述第四子特征所在的分组,且所述第一子特征所在的分组和所述第三子特征所在的分组相同;
    所述联邦学习服务端根据所述第一局部特征、所述第一局部特征所在的分组、所述第二局部特征以及所述第二局部特征所在的分组得到全局先验特征包括:
    基于所述第一子特征所在的分组和所述第三子特征所在的分组相同,所述联邦学习服务端对所述第一子特征和所述第三子特征进行处理,得到中间特征;
    所述联邦学习服务端根据所述中间特征、所述第二子特征、所述第四子特征、所述第二子特征所在的分组以及所述第四子特征所在的分组,得到全局先验特征。
  38. 根据权利要求24至37中任意一项所述的系统,其特征在于,所述联邦学习服务端根据所述第一局部特征和第二局部特征,得到全局先验特征包括:
    所述联邦学习服务端根据所述第一局部特征、第二局部特征、来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征,得到全局先验特征。
  39. 根据权利要求38所述的系统,其特征在于,所述联邦学习服务端根据所述第一局部特征、第二局部特征、来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征,得到全局先验特征包括:
    所述联邦学习服务端计算当前推理过程的局部特征与多组历史局部特征的相似度,所述当前推理过程的局部特征包括所述第一局部特征和所述第二局部特征,每组历史局部特征包括一次历史推理过程中来自所述第一联邦学习客户端的历史局部特征以及来自所述第二联邦学习客户端的历史局部特征;
    所述联邦学习服务端根据所述当前推理过程的局部特征与多组历史局部特征的相似度,对所述多组历史局部特征对应的历史先验特征进行加权求和,以得到所述全局先验特征。
  40. 根据权利要求39所述的系统,其特征在于,所述多组历史局部特征都具有标签,所述标签为人为标注的每组所述历史局部特征的实际结果;
    所述系统还包括:
    所述第一联邦学习客户端向所述联邦学习服务端发送所述第一联邦学习客户端的推理结果;
    所述联邦学习服务端根据所述第一联邦学习客户端的推理结果和所述第二联邦学习客户端的推理结果确定目标推理结果;
    在所述当前推理过程的局部特征与目标组历史局部特征的相似度大于或等于阈值的情况下,所述联邦学习服务端根据所述当前推理过程的局部特征与所述目标组历史局部特征的相似度对所述目标组历史局部特征进行更新,所述目标组历史局部特征为所述多组历史局部特征中,标签为所述目标推理结果的一组历史局部特征;
    在所述当前推理过程的局部特征与所述目标组历史局部特征的相似度小于所述阈值的情况下,所述联邦学习服务端在所述多组历史局部特征的基础上增加一组历史局部特征,增加的一组历史局部特征为所述当前推理过程的局部特征。
  41. 根据权利要求24至40中任意一项所述的系统,其特征在于,所述系统还包括:
    所述联邦学习服务端向所述第一联邦学习客户端发送任务同步信息,所述任务同步信息用于指示所述第一时隙;
    所述第一联邦学习客户端根据所述任务同步信息从本地数据中选择出所述第一网络数据。
  42. 一种推理装置,其特征在于,应用于第一联邦学习客户端,所述装置包括:
    收发单元,用于向联邦学习服务端发送第一局部特征,所述第一局部特征是从第一网络数据中提取出的,所述第一网络数据为所述第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,所述目标网络资源为所述第一联邦学习客户端管理的网络资源;
    所述收发单元,用于接收来自所述联邦学习服务端的全局先验特征,所述全局先验特征是根据所述第一局部特征和第二局部特征得到的,所述第二局部特征是由第二联邦学习客户端提供;
    处理单元,用于根据所述全局先验特征和第二网络数据进行推理,以得到推理结果,所述第二网络数据为所述第一联邦学习客户端在第二时隙获取的与所述目标网络资源有关的数据,所述推理结果用于管理所述目标网络资源,其中,所述第二时隙与所述第一时隙相同或在所述第一时隙之后。
  43. 一种推理装置,其特征在于,应用于联邦学习服务端,所述装置包括:
    收发单元,用于接收来自第一联邦学习客户端的第一局部特征,所述第一局部特征是从第一网络数据中提取出的,所述第一网络数据为所述第一联邦学习客户端在第一时隙获取的与目标网络资源有关的数据,所述目标网络资源为所述第一联邦学习客户端管理的网络资源;
    处理单元,用于根据所述第一局部特征和第二局部特征,得到全局先验特征,所述第二局部特征是由第二联邦学习客户端提供;
    所述收发单元,用于向所述第一联邦学习客户端发送所述全局先验特征,使得所述第一联邦学习客户端根据所述全局先验特征和第二网络数据进行推理,以得到推理结果,所述第二网络数据为所述第一联邦学习客户端在第二时隙获取的与所述目标网络资源有关的数据,所述推理结果用于管理所述目标网络资源,其中,所述第二时隙与所述第一时隙相同或在所述第一时隙之后。
  44. 一种计算机设备,其特征在于,所述计算机设备包括:存储器和处理器;
    所述处理器,用于执行所述存储器中存储的计算机程序或指令,以使所述计算机设备执行如权利要求1-14中任一项所述的方法。
  45. 一种计算机设备,其特征在于,所述计算机设备包括:存储器和处理器;
    所述处理器,用于执行所述存储器中存储的计算机程序或指令,以使所述计算机设备执行如权利要求15-23中任一项所述的方法。
  46. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质具有程序指令,当所述程序指令被直接或者间接执行时,使得如权利要求1至23中任一所述的方法被实现。
  47. 一种芯片系统,其特征在于,所述芯片系统包括至少一个处理器,所述处理器用于执行存储器中存储的计算机程序或指令,当所述计算机程序或所述指令在所述至少一个处理器中执行时,使得如权利要求1至23中任一所述的方法被实现。
  48. 一种计算机程序产品,其特征在于,包括指令,当所述指令在计算机上运行时,使得计算机执行权利要求1至23中任一项所述的方法。
  49. 一种通信系统,其特征在于,包括第一联邦学习客户端和联邦学习服务端,所述第一联邦学习客户端用于执行如权利要求1-14中任一项所述的方法,所述联邦学习服务端用于执行如权利要求15-23中任一项所述的方法。
PCT/CN2023/103784 2022-08-11 2023-06-29 一种推理方法及相关装置 WO2024032214A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210962642.4A CN117648981A (zh) 2022-08-11 2022-08-11 一种推理方法及相关装置
CN202210962642.4 2022-08-11

Publications (1)

Publication Number Publication Date
WO2024032214A1 (zh)

Family

ID=89850661

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/103784 WO2024032214A1 (zh) 2022-08-11 2023-06-29 一种推理方法及相关装置

Country Status (2)

Country Link
CN (1) CN117648981A (zh)
WO (1) WO2024032214A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200321A (zh) * 2020-12-04 2021-01-08 同盾控股有限公司 基于知识联邦和图网络的推理方法、系统、设备及介质
WO2021115480A1 (zh) * 2020-06-30 2021-06-17 平安科技(深圳)有限公司 联邦学习方法、装置、设备和存储介质
CN112989944A (zh) * 2021-02-08 2021-06-18 西安翔迅科技有限责任公司 一种基于联邦学习的视频智能安全监管方法
CN113435604A (zh) * 2021-06-16 2021-09-24 清华大学 一种联邦学习优化方法及装置
CN114048838A (zh) * 2021-10-26 2022-02-15 西北工业大学 一种基于知识迁移的混合联邦学习方法
US20220114475A1 (en) * 2020-10-09 2022-04-14 Rui Zhu Methods and systems for decentralized federated learning
CN114580662A (zh) * 2022-02-28 2022-06-03 浙江大学 基于锚点聚合的联邦学习方法和系统
CN114861936A (zh) * 2022-05-10 2022-08-05 天津大学 一种基于特征原型的联邦增量学习方法


Also Published As

Publication number Publication date
CN117648981A (zh) 2024-03-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23851441

Country of ref document: EP

Kind code of ref document: A1