CN113723434A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN113723434A
Authority
CN
China
Prior art keywords
data analysis
analysis entity
model
information
entity
Prior art date
Legal status
Pending
Application number
CN202010438147.4A
Other languages
Chinese (zh)
Inventor
李爱华
史嫄嫄
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Communications Ltd Research Institute
Priority to CN202010438147.4A
Publication of CN113723434A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a data processing method, a data processing device, data processing equipment and a storage medium. The method includes: a first data analysis entity obtains first model information of a model determined by a second data analysis entity; aggregates the first model information to obtain aggregated second model information of the model; and sends the aggregated second model information to the second data analysis entity, where the aggregated second model information is used by the second data analysis entity to update the model. The embodiment of the invention can combine different data sets in scenarios such as different vendors and cross-region deployments to complete the training process of the model, so as to meet requirements such as data privacy and security across regions and vendors.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
In a network intelligence architecture, a data analysis entity is centrally responsible for data acquisition, model training, and inference feedback. However, in current network practice, due to data privacy and security considerations, the data analysis entity may not be able to obtain the relevant data for performing model training in application scenarios such as different vendors and cross-region deployments, and thus model training cannot be completed.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data processing method, apparatus, device, and storage medium, which aim to implement model training of network data based on a data analysis entity.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a data processing method which is applied to a first data analysis entity, and the method comprises the following steps:
acquiring first model information of a model determined by a second data analysis entity;
aggregating the first model information to obtain aggregated second model information of the model;
and sending the aggregated second model information to the second data analysis entity, wherein the aggregated second model information is used for the second data analysis entity to update the model.
The embodiment of the invention also provides a data processing method which is applied to a second data analysis entity, and the method comprises the following steps:
sending first model information of the model to a first data analysis entity;
receiving second aggregated model information of the model sent by the first data analysis entity;
updating the model based on the aggregated second model information.
The embodiment of the invention also provides a data processing method, which is applied to a data analysis entity, and the method comprises the following steps:
sending registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federated learning parameters, or the address information.
The embodiment of the invention also provides a data processing method, which is applied to the network element with the network element storage function, and the method comprises the following steps:
receiving registration information sent by a data analysis entity, wherein the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federated learning parameters, or address information of the data analysis entity;
and the network element storage function network element stores the registration information.
An embodiment of the present invention further provides a data processing apparatus, which is applied to a first data analysis entity, and the apparatus includes:
the first acquisition module is used for acquiring first model information of the model determined by the second data analysis entity;
the aggregation module is used for aggregating the first model information to obtain aggregated second model information of the model;
and the first sending module is used for sending the aggregated second model information to the second data analysis entity, and the aggregated second model information is used for the second data analysis entity to update the model.
The embodiment of the invention also provides a data processing device, which is applied to a second data analysis entity, and the device comprises:
the second sending module is used for sending the first model information of the model to the first data analysis entity;
a first receiving module, configured to receive second model information after aggregation of the models sent by the first data analysis entity;
and the model training module is used for updating the model based on the aggregated second model information.
An embodiment of the present invention further provides a data processing apparatus, which is applied to a data analysis entity, and the apparatus includes:
a third sending module, configured to send registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federated learning parameters, or the address information.
The embodiment of the invention also provides a data processing device, which is applied to the network element with the network element storage function, and the device comprises:
a second receiving module, configured to receive registration information sent by a data analysis entity, where the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federated learning parameters, or address information of the data analysis entity;
and the storage module is used for storing the registration information.
An embodiment of the present invention further provides a first network device, including: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor, when running the computer program, is configured to perform the steps of the method described in the first data analysis entity side of the embodiment of the present invention.
An embodiment of the present invention further provides a second network device, including: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor, when running the computer program, is configured to perform the steps of the method described in the second data analysis entity side of the embodiment of the present invention.
An embodiment of the present invention further provides a third network device, including: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor, when running the computer program, is configured to perform the steps of the method described by the data analysis entity of an embodiment of the present invention.
An embodiment of the present invention further provides a fourth network device, including: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor, when running the computer program, is configured to execute the steps of the method described in the network element storage function network element side of the embodiment of the present invention.
An embodiment of the present invention further provides a network system, including: the first network device and the second network device according to the embodiments of the present invention are configured such that the first network device is communicatively connected to at least two of the second network devices.
The embodiment of the present invention further provides a storage medium, where a computer program is stored on the storage medium, where the computer program is executed by a processor to implement the steps of the method according to any embodiment of the present invention.
According to the technical solution provided by the embodiment of the invention, through the cooperation of the first data analysis entity and the second data analysis entity, the first data analysis entity aggregates the first model information of the model determined by the second data analysis entity to obtain the aggregated second model information of the model and sends the aggregated second model information to the second data analysis entity, so that the second data analysis entity can update the model based on the aggregated second model information. In this way, different data sets can be combined in scenarios such as different vendors and cross-region deployments to complete the training process of the model, and the requirements of data privacy and security across regions and vendors are met.
Drawings
FIG. 1 is a schematic flowchart of a data processing method on the first data analysis entity side according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a data processing method on the second data analysis entity side according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a network system to which the data processing method of the embodiment of the present invention is applied;
FIG. 4 is a schematic structural diagram of a data processing apparatus on the first data analysis entity side according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data processing apparatus on the second data analysis entity side according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a first network device according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a second network device according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a third network device according to the embodiment of the present invention;
fig. 9 is a schematic structural diagram of a fourth network device according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of a network system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
In the related art, in application scenarios such as different vendors and cross-region deployments, a data analysis entity may not be able to obtain the relevant data for performing model training, and thus model training cannot be completed. In this regard, in various embodiments of the present invention, through the cooperation of a first data analysis entity (also referred to as a central data analysis entity) and a second data analysis entity (also referred to as a regional data analysis entity), the second data analysis entity performs model training based on a local data set and transmits the first model information of the model to the first data analysis entity; the first data analysis entity aggregates the first model information to obtain aggregated second model information of the model and sends the aggregated second model information to the second data analysis entity, so that the second data analysis entity can update the model based on the aggregated second model information. In this way, different data sets can be combined in scenarios such as different vendors and cross-region deployments to complete the training process of the model, and requirements such as data privacy and security across regions and vendors are met.
As shown in fig. 1, an embodiment of the present invention provides a data processing method applied to a first data analysis entity, where the method includes:
step 101, obtaining first model information of a model determined by a second data analysis entity;
Here, the first data analysis entity and the second data analysis entity may each be an NWDAF (Network Data Analytics Function), where the first data analysis entity is the data analysis entity that aggregates the first model information of the models of the second data analysis entities, and the second data analysis entity is a data analysis entity whose first model information of the corresponding model needs to be aggregated. Each data analysis entity may also be: a Management Data Analytics Function (MDAF) or Management Data Analytics Service (MDAS) on the network management side, a Non-Real-Time RAN Intelligent Controller or Near-Real-Time RAN Intelligent Controller of the Open RAN on the RAN (Radio Access Network) side, a built-in AI (Artificial Intelligence) module on the UE side, or a third-party AF (Application Function), such as Alibaba Cloud, Google Cloud, or Amazon AWS. The embodiment of the invention does not specifically limit the data analysis entity.
Here, the first model information may be understood as model parameters of a model of the local region generated by the second data analysis entity based on the data of the local region, and the first model information of the model determined by the second data analysis entity includes one or more of the following information: number of samples, model gradient value, model creation time.
In some embodiments, the first data analysis entity sends an analysis request to the second data analysis entity, the analysis request including parameters for the second data analysis entity to determine the first model information. The parameters include one or more of the following information: maximum response time, maximum number of iterations, data type list. The maximum response time is used by the second data analysis entity to determine the validity period for reporting the first model information; first model information reported after this period is invalid. The maximum number of iterations is used by the second data analysis entity to determine the number of iterations for local training; exceeding it would affect the overall model training progress. The data type list is used to determine the data types for data analysis.
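For illustration only, the following is a minimal sketch of how such an analysis request and the resulting first model information could be represented as data structures; the field names (e.g. max_response_time_s, data_type_list) are assumptions and are not defined by this embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AnalysisRequest:
    """Parameters the first data analysis entity may pass to a second data analysis entity."""
    analysis_target: str                          # e.g. "UE mobility analytics"
    max_response_time_s: Optional[float] = None   # validity window for reporting first model information
    max_iterations: Optional[int] = None          # cap on local training iterations
    data_type_list: List[str] = field(default_factory=list)  # data types used for the analysis

@dataclass
class FirstModelInfo:
    """First model information reported back by a second data analysis entity."""
    num_samples: int                  # size of the local data set
    model_gradient: List[float]       # gradient of the loss with respect to the model parameters
    model_creation_time: Optional[str] = None
```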
In some embodiments, the analysis request (Analytics) may be generated by the first data analysis entity based on an analysis request sent by a Network Function (NF) or a subscribed analysis request.
Based on this, in some embodiments, before the first data analysis entity sends the analysis request to the second data analysis entity, the method further comprises:
the first data analysis entity generates the analysis request based on an analysis request sent by the NF or a subscribed analysis request.
Here, the analysis request may be any of various analysis requirements related to network performance, such as UE (User Equipment) mobility analytics (Mobility Analytics).
In practical application, the first data analysis entity further needs to obtain a network address of the second data analysis entity corresponding to the analysis request. Based on this, in some embodiments, the method further comprises: the first data analysis entity determines address information of the second data analysis entity, which includes:
the first data analysis entity determines the address information of the second data analysis entity based on preconfigured address information of the second data analysis entity; or,
the first data analysis entity determines the address information of the second data analysis entity through a network element storage function network element.
Here, determining the address information of the second data analysis entity based on pre-configured address information corresponds to discovering the second data analysis entity in a static manner, while determining the address information through a network element storage function network element corresponds to discovering the second data analysis entity in a dynamic manner. The network element storage function network element in the embodiment of the present invention may be an NRF (NF Repository Function) network element, a DNS (Domain Name System) network element, or a UDR (Unified Data Repository) network element.
In an application example, for the static manner, the parameter information of each second data analysis entity in the jurisdiction is preconfigured on the first data analysis entity, and the parameter information may include: the address information, service area, federated learning parameters (whether federated learning is supported, the supported algorithms), hierarchical information, etc. of the second data analysis entity. The first data analysis entity determines the address information of at least two second data analysis entities matching the analysis request based on the preconfigured parameter information of the second data analysis entities, for example, based on the analysis target, the analysis region, and the like in the analysis request, and sends the analysis request to the corresponding second data analysis entities.
For the above dynamic mode, the determining, by the first data analysis entity through the network element storage function network element, the address information of the second data analysis entity includes:
the first data analysis entity sends a first request to a network element storage function network element, where the first request is used to request address information of the second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information, and the load threshold of the second data analysis entity;
and receiving the address information of the second data analysis entity determined by the network element storage function network element based on the first request.
Here, the capability information may include one or more of the following: whether federated learning is supported, and the supported algorithms.
In an application example, each second data analysis entity registers its own address information, service area, federated learning parameters (whether federated learning is supported, the supported algorithms), hierarchical information, and the like with the network element storage function. The first data analysis entity, based on the analysis request, requests from the network element storage function the address information of each second data analysis entity in its jurisdiction according to the area it covers, the federated learning parameters, the hierarchical information, and the like, and, after acquiring the address information of the second data analysis entities matching the analysis request, sends the analysis request to the corresponding second data analysis entities.
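As a rough sketch of the dynamic discovery described above, the in-memory registry below mimics what a network element storage function might store and how a request from the first data analysis entity could be matched; the class and method names (StorageFunction, register, discover) are illustrative assumptions and do not correspond to the actual NRF service-based interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class RegistrationInfo:
    entity_type: str                   # e.g. "client" (local trainer) or "server" (central trainer)
    address: str
    service_area: str
    supports_federated_learning: bool
    supported_algorithms: List[str]
    load: float = 0.0                  # current load, compared against the requested load threshold
    tier: Optional[int] = None         # hierarchical information

class StorageFunction:
    """Toy stand-in for the network element storage function (e.g. an NRF)."""

    def __init__(self) -> None:
        self._registry: Dict[str, RegistrationInfo] = {}

    def register(self, entity_id: str, info: RegistrationInfo) -> None:
        self._registry[entity_id] = info

    def discover(self, area: str, algorithm: str, load_threshold: float) -> List[str]:
        """Return addresses of second data analysis entities matching a first request."""
        return [
            info.address
            for info in self._registry.values()
            if info.entity_type == "client"
            and info.service_area == area
            and info.supports_federated_learning
            and algorithm in info.supported_algorithms
            and info.load <= load_threshold
        ]
```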
Here, after receiving the analysis request sent by the first data analysis entity, each second data analysis entity acquires data based on the analysis request, and generates a data set for model training. For example, the second data analysis entity collects network data from a network element, a network manager, an AF (Application Function), a RAN (radio access network), or a UE, and forms a local data set.
And the second data analysis entity performs initial model training based on the local data set until the training is finished to obtain the first model information of the model. Here, the termination condition of the model training includes at least one of:
the loss function of the model reaches a value less than or equal to a set threshold value;
the training times of the model reach preset times;
the training time of the model reaches the set time.
In practical application, the set duration of the model training may be determined based on the maximum response time in the parameters, and the preset number of times of the model training may be determined based on the maximum number of iterations in the parameters.
The second data analysis entity sends first model information (e.g., model gradient values and number of samples of the data set) of the model after training is finished to the first data analysis entity.
Here, the model gradient value refers to a gradient value of a loss function value of the model to a model parameter.
In an application example, assume that the model parameters of the model of a second data analysis entity are $\Theta_I$ and its loss function value is $L_I$; then the gradient value of the model is
$$\nabla_{\Theta_I} L_I,$$
where $I = A, B, \ldots, K$ respectively denotes the different second data analysis entities.
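The sketch below illustrates the local training step at a second data analysis entity, assuming a simple linear model with a mean-squared-error loss purely for concreteness; the embodiment does not fix a particular model family or loss function. The returned number of samples and gradient correspond to the first model information obtained in step 101.

```python
import numpy as np

def local_training(theta, X, y, lr=0.01, max_iterations=10_000, loss_threshold=0.1):
    """Train locally; return the number of samples, the final gradient and the parameters."""
    n = len(y)
    grad = np.zeros_like(theta)
    for _ in range(max_iterations):                 # end condition: preset number of iterations
        pred = X @ theta
        loss = np.mean((pred - y) ** 2)
        grad = 2.0 * X.T @ (pred - y) / n           # gradient of the loss w.r.t. the parameters
        if loss <= loss_threshold:                  # end condition: loss below the set threshold
            break
        theta = theta - lr * grad                   # ordinary local gradient step
    return n, grad, theta
```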
Step 102, aggregating the first model information to obtain aggregated second model information of the model;
here, the aggregation refers to weighted summation of the first model information of the model of the second data analysis entity based on the proportion of the number of samples. The first data analysis entity may employ Federated Learning (fed Learning) to aggregate the first model information of the second data analysis entity. Federal learning refers to: all data are kept locally, so that privacy is not disclosed and regulations are not violated; each participant combines data to establish a virtual common model and a system which benefits jointly; under a federal learning system, the identity and the status of each participant are the same; the modeling effect of federated learning is the same as, or comparable to, the effect of placing the entire data set in one place for modeling. In practical application, the first data analysis entity may aggregate the first model information of at least two second data split entities to obtain aggregated second model information of the model.
In some embodiments, the first data analysis entity aggregates the number of samples and the model gradient value of each second data analysis entity to obtain aggregated second model information. Specifically, aggregating the sample number and the model gradient value of each second data analysis entity to obtain aggregated second model information includes:
and carrying out weighted summation on the model gradient values of the second data analysis entities based on the proportion of the number of the samples, and determining the aggregated second model information based on the result of the weighted summation.
In an application example, the aggregated second model information is represented as follows:
$$\overline{\nabla L} = \sum_{I=A}^{K} \frac{n_I}{n}\, \nabla_{\Theta_I} L_I, \qquad n = \sum_{I=A}^{K} n_I,$$
where $|K|$ is the number of second data analysis entities and $n_I$ is the number of samples of the $I$-th second data analysis entity.
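A minimal sketch of this aggregation step, assuming the reported gradients arrive as arrays of a common shape:

```python
import numpy as np

def aggregate(first_model_infos):
    """first_model_infos: iterable of (num_samples, gradient) pairs from the second entities."""
    infos = [(n, np.asarray(g)) for n, g in first_model_infos]
    total = sum(n for n, _ in infos)                   # total number of samples
    return sum((n / total) * g for n, g in infos)      # sample-proportion weighted sum
```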
step 103, sending the aggregated second model information to the second data analysis entity, where the aggregated second model information is used by the second data analysis entity to update the model.
In some embodiments, the second data analysis entity updates the model based on the aggregated second model information, including:
updating first model information corresponding to the model based on the aggregated second model information;
and performing model training based on the updated first model information and the data set until the training is finished to obtain an updated model.
In an application example, updating the first model information corresponding to the model based on the aggregated second model information is represented as follows:
$$\Theta_I \leftarrow \Theta_I - \alpha\, \overline{\nabla L},$$
where $\alpha$ is a correction coefficient.
The second data analysis entity performs model training based on the updated first model information and the data set until the loss function of the model is less than or equal to the set threshold, or the number of training iterations of the model reaches the preset number, or the training duration of the model reaches the set duration, and thereby obtains the updated model.
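Continuing the sketch above, the update at the second data analysis entity could look as follows, with alpha playing the role of the correction coefficient; local_training is the hypothetical helper shown earlier.

```python
import numpy as np

def update_and_retrain(theta, aggregated_grad, X, y, alpha=0.01, **train_kwargs):
    """Apply the aggregated second model information, then resume local training."""
    theta = theta - alpha * np.asarray(aggregated_grad)   # correct parameters with the aggregated gradient
    return local_training(theta, X, y, **train_kwargs)    # train until an end condition is met
```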
According to the data processing method provided by the embodiment of the invention, the first data analysis entity does not need to acquire the local data set of each second data analysis entity; it only needs to acquire the first model information of model training (the number of samples and the model gradient value) sent by each second data analysis entity, aggregate the first model information of the models of the second data analysis entities to obtain the aggregated second model information, and send the aggregated second model information to each second data analysis entity, so that each second data analysis entity can update its model based on the received aggregated second model information.
An embodiment of the present invention further provides a data processing method, as shown in fig. 2, applied to a second data analysis entity, where the method includes:
step 201, sending first model information of a model to a first data analysis entity;
here, the first model information may be understood as a model parameter that is generated by the second data analysis entity based on the data of the local region and is applicable to the model of the local region. The first model information may include one or more of the following information: number of samples, model gradient value, model creation time. For example, the first model information may be a sample number and a model gradient value, where the sample number refers to a sample number corresponding to a local data set, and the model gradient value refers to a gradient value of a loss function value of a model to a model parameter.
In some embodiments, the data processing method further comprises:
receiving an analysis request sent by a first data analysis entity, wherein the analysis request comprises parameters for determining the first model information by the second data analysis entity;
obtaining data based on the analysis request;
first model information of a model is determined based on the data.
Here, the second data analysis entity receives the analysis request transmitted by the first data analysis entity, and the analysis request may be an analysis request generated by the first data analysis entity based on an analysis request transmitted by the NF or a subscribed analysis request. The first data analysis entity may find a second data analysis entity matched in the jurisdiction based on a static manner or a dynamic manner, and send an analysis request to the corresponding second data analysis entity, which may specifically refer to the foregoing description, and is not described herein again.
Here, the second data analysis entity may collect network data from a network element, network management, AF, RAN or UE, etc. based on the analysis request, forming a local data set.
Here, the second data analysis entity performs initial model training based on a local data set formed by the data until the training is finished, and obtains first model information of the model. Here, the termination condition of the model training includes at least one of:
the loss function of the model reaches a value less than or equal to a set threshold value;
the training times of the model reach preset times;
the training time of the model reaches the set time.
Here, the first model information of the model includes one or more of the following information: number of samples, model gradient value, model creation time.
In some embodiments, the analysis request includes parameters for the second data analysis entity to determine the first model information, the parameters including one or more of the following information: maximum response time, maximum number of iterations, data type list. In practical application, the set duration of the model training may be determined based on the maximum response time in the parameters, and the preset number of times of the model training may be determined based on the maximum number of iterations in the parameters.
Step 202, receiving the aggregated second model information of the model sent by the first data analysis entity;
here, the first data analysis entity performs aggregation operation on the number of samples and the model gradient value of each second data analysis entity to obtain aggregated second model information.
Step 203, updating the model based on the aggregated second model information.
In some embodiments, said updating said model based on said aggregated second model information comprises:
updating first model information corresponding to the model based on the aggregated second model information;
and performing model training based on the updated first model information and the data set until the training is finished to obtain an updated model.
The second data analysis entity performs model training based on the updated model parameters and the data set until the loss function of the model is less than or equal to the set threshold, or the number of training iterations of the model reaches the preset number, or the training duration of the model reaches the set duration, and thereby obtains the updated model.
Therefore, the second data analysis entity can update the model based on the aggregated second model information, so that different data sample sets can be combined to complete the training of the model, and the requirements of data privacy and security across regions and vendors are met.
The embodiment of the invention also provides a data processing method, which is applied to a data analysis entity, and the method comprises the following steps:
sending registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federated learning parameters, or the address information.
Here, the data analysis entity may be the first data analysis entity or the second data analysis entity, and each data analysis entity may implement registration at the network element storage function network element side by sending registration information to the network element storage function network element.
Here, the first data analysis entity is of a server, a coordinator or a central trainer type, and the second data analysis entity is of a client, a local trainer or a user side type.
Here, the federated learning parameters may include one or more of the following information corresponding to federated learning: algorithm type, algorithm identification, or algorithm convergence speed. In some embodiments, the registration information may further include: service area, tier information, etc.
In an embodiment, the data analysis entity is a first data analysis entity, the method further comprising:
the first data analysis entity sends a first request to the network element storage function network element, where the first request is used to request address information of a second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information, and the load threshold of the second data analysis entity;
and receiving the address information of the second data analysis entity determined by the network element storage function network element based on the first request.
In this way, the first data analysis entity may request, based on the first request, the network element storage function network element for address information of a corresponding second data analysis entity in the jurisdiction, and send an analysis request based on the address information of the corresponding second data analysis entity.
The embodiment of the invention also provides a data processing method, which is applied to the network element with the network element storage function, and the method comprises the following steps:
receiving registration information sent by a data analysis entity, wherein the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federated learning parameters, or address information of the data analysis entity;
and storing the registration information.
In this way, the network element storage function network element may enable registration of the first data analysis entity and/or the second data analysis entity.
Here, the first data analysis entity is of a server, a coordinator or a central trainer type, and the second data analysis entity is of a client, a local trainer or a user side type.
Here, the federated learning parameters may include one or more of the following information corresponding to federated learning: algorithm type, algorithm identification, or algorithm convergence speed. In some embodiments, the registration information may further include: service area, tier information, etc.
In some embodiments, the method further comprises:
receiving a first request sent by a first data analysis entity, wherein the first request is used for requesting address information of a second data analysis entity, and the first request comprises one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information and the load threshold of the second data analysis entity;
determining address information of the second data analysis entity based on the first request, and sending the address information of the second data analysis entity to the first data analysis entity.
In this way, the network element storage function network element may send, to the first data analysis entity, the address information of the corresponding second data analysis entity in the jurisdiction of the first data analysis entity based on the first request sent by the first data analysis entity, so that the first data analysis entity can send the analysis request based on the address information of the corresponding second data analysis entity.
The present invention will be described in further detail with reference to the following application examples.
As shown in fig. 3, in this application example, the first data analysis entity is a central data analysis entity, and the second data analysis entity includes regional data analysis entities; a second data analysis entity may be co-located with a 5G NF such as the AMF (Access and Mobility Management Function) or the SMF (Session Management Function). It should be noted that the central data analysis entity may be an independent data analysis entity, or may be deployed in a converged manner with other data analysis entities, which is not specifically limited in the embodiment of the present invention.
The data processing method in this application example is described below with reference to fig. 3. The data processing method may be a network data processing method, a terminal data processing method, or a third-party service data processing method; that is, the central data analysis entity may perform data processing on network data, terminal data, or third-party service data, so that the regional data analysis entities on the network side, the terminal side, or the third-party service side can complete model training. Here, the first data analysis entity may be a serving NWDAF, a central NWDAF, or a coordinator NWDAF, and the second data analysis entity may be a local NWDAF, a distributed NWDAF, or a client NWDAF. The method specifically comprises the following steps:
step 1), the NF sends an analysis request (Analytics) to the central data analysis entity, requesting or subscribing to a relevant analysis result (e.g. UE mobility analysis result: mobility Analytics).
step 2), the central data analysis entity sends an analysis request to each regional data analysis entity (including any data analysis entity co-located with an NF).
Here, the central data analysis entity may send the analysis request to the regional data analysis entity based on a static manner or a dynamic manner.
For the static mode, the parameter information of each regional data analysis entity in the jurisdiction is preconfigured on the central data analysis entity, and the parameter information may include: the network address of the regional data analysis entity, the service region, federated learning parameters (whether federated learning is supported, the supported algorithms), hierarchical information, etc. The central data analysis entity determines at least two regional data analysis entities matching the analysis request based on the preconfigured parameter information of the regional data analysis entities, for example, based on the analysis target, the analysis region, and the like in the analysis request, and sends the analysis request to the corresponding regional data analysis entities.
For the dynamic mode, each regional data analysis entity or 5G NF (integrated with a data analysis entity) registers its own network address, service region, federated learning parameters (whether federated learning is supported, the supported algorithms), hierarchical information, and the like with the NRF. The central data analysis entity requests from the NRF the data analysis entity addresses of all sub-areas in its jurisdiction according to the area it covers, the federated learning parameters, the hierarchical information, and the like, and sends the analysis request to the corresponding regional data analysis entities.
step 3), each regional data analysis entity collects network data from network elements, the network management system, the AF, the RAN, or the UE, etc., to form a local data set.
Here, the local data sets formed by the regional data analysis entities are respectively
$$D_A = \{(x_i, y_i)\}, \quad D_B = \{(x_j, y_j)\}, \quad \ldots, \quad D_K = \{(x_k, y_k)\},$$
wherein $x$ is the dimensional feature of the network data, $y$ is the label corresponding to the analysis requirement, $D_A, D_B, \ldots, D_K$ are the local data sets, and $i, j, \ldots, k$ respectively denote the indices of the sample data in the corresponding data sets.
step 4), the regional data analysis entity trains the model based on the model parameters of the model and the loss function of the model until the training is finished, to obtain the model.
Here, the models of the respective regional data analysis entities are respectively expressed as
$$f_{\Theta_A}, \quad f_{\Theta_B}, \quad \ldots, \quad f_{\Theta_K},$$
wherein $\Theta_A, \Theta_B, \ldots, \Theta_K$ respectively denote the model parameters corresponding to the respective models.
The loss function of each model is represented as follows:
$$L_I = L(\Theta_I; D_I), \qquad I = A, B, \ldots, K,$$
i.e. the loss of the model with parameters $\Theta_I$ evaluated on the local data set $D_I$.
the end condition of the model training comprises at least one of the following conditions:
the loss function of the model reaches a value less than or equal to a set threshold value;
the training times of the model reach preset times;
the training time of the model reaches the set time.
For example, when the regional data analysis entity determines that the loss function value is less than or equal to a threshold (e.g., $L_I < 0.1$), or that the number of iterations is greater than or equal to a preset number (e.g., 10,000), the regional data analysis entity terminates the training process of the model and obtains the local model. Taking the data set $D_A$ as an example, the obtained model is $f_{\Theta_A}$.
step 5), the regional data analysis entity sends the number of samples $n_I$ of its local data set and the model gradient value $\nabla_{\Theta_I} L_I$ to the central data analysis entity.
step 6), the central data analysis entity completes the model gradient value aggregation operation and delivers the result of the aggregation operation to each regional data analysis entity.
Here, the result of the aggregation operation is expressed as follows:
$$\overline{\nabla L} = \sum_{I=A}^{K} \frac{n_I}{n}\, \nabla_{\Theta_I} L_I, \qquad n = \sum_{I=A}^{K} n_I,$$
where $|K|$ is the number of regional data analysis entities.
and 7), updating model parameters of the model by each regional data analysis entity based on the result of the aggregation operation, and returning to the step 4) for model training to obtain the updated model.
Here, updating the model parameters corresponding to the model based on the result of the aggregation operation is represented as follows:
$$\Theta_I \leftarrow \Theta_I - \alpha\, \overline{\nabla L},$$
where $\alpha$ is a correction coefficient. In this way, the model parameters corrected based on the result of the aggregation operation can be obtained.
The regional data analysis entity (i.e. the second data analysis entity) performs model training based on the updated model parameters and the data set until the loss function of the model is less than or equal to the set threshold or the number of training iterations of the model reaches the preset number, and obtains the updated model; refer specifically to the training process in step 4).
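For illustration, the toy driver below strings steps 1) to 7) together for a fixed number of rounds, reusing the local_training and aggregate sketches shown earlier; the RegionalEntity container and all other names are assumptions, and entity discovery and data collection are omitted.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RegionalEntity:
    theta: np.ndarray   # local model parameters
    X: np.ndarray       # local data set features (collected in step 3)
    y: np.ndarray       # local labels

def federated_rounds(entities, alpha=0.01, num_rounds=10):
    for _ in range(num_rounds):
        reports = []
        for e in entities:                                      # steps 4) and 5): local training and reporting
            n, grad, e.theta = local_training(e.theta, e.X, e.y)
            reports.append((n, grad))
        aggregated = aggregate(reports)                         # step 6): central aggregation
        for e in entities:                                      # step 7): parameter correction
            e.theta = e.theta - alpha * aggregated
    return [e.theta for e in entities]
```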
According to the above data processing method, through the cooperation of the central data analysis entity and the regional data analysis entities, different data sample sets are combined in scenarios such as different vendors and different regions based on the data analysis entity intelligence architecture, the training process of the model is completed, and requirements such as data privacy and security across regions and vendors are met.
In order to implement the first data analysis entity-side data processing method according to the embodiment of the present invention, an embodiment of the present invention further provides a data processing apparatus, which corresponds to the first data analysis entity-side data processing method, and each step in the first data analysis entity-side data processing method embodiment is also completely applicable to the embodiment of the data processing apparatus.
As shown in fig. 4, the data processing apparatus includes: a first sending module 401, a first obtaining module 402 and an aggregation module 403. The first obtaining module 402 is configured to obtain first model information of a model determined by a second data analysis entity; the aggregation module 403 is configured to aggregate the first model information to obtain aggregated second model information of the model; the first sending module 401 is configured to send the aggregated second model information to the second data analysis entity, so that the second data analysis entity updates the model based on the aggregated second model information.
In some embodiments, the first sending module 401 is further configured to send an analysis request to a second data analysis entity, the analysis request including at least parameters for the second data analysis entity to determine the first model information, the parameters including one or more of the following information: maximum response time, maximum number of iterations, data type list.
In some embodiments, the data processing apparatus further comprises: a determining module 404, configured to determine address information of a second data analysis entity, where the determining module 404 is specifically configured to:
determining the address information of the second data analysis entity based on preconfigured address information of the second data analysis entity; or,
determining the address information of the second data analysis entity through a network element storage function network element.
In some embodiments, the determining module 404 is specifically configured to:
sending a first request to a network element storage function network element, where the first request is used to request address information of the second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the model information, and the load threshold of the second data analysis entity;
and receiving the address information of the second data analysis entity determined by the network element storage function network element based on the first request.
In some embodiments, the data processing apparatus further comprises: an analysis request generation module 405, configured to generate the analysis request based on an analysis request sent by the NF or a subscribed analysis request.
In some embodiments, the aggregation module 403 is specifically configured to:
and carrying out weighted summation on the model gradient values of the second data analysis entities based on the proportion of the number of the samples, and determining the aggregated second model information based on the result of the weighted summation.
In actual application, the first sending module 401, the first obtaining module 402, the aggregating module 403, the determining module 404, and the analysis request generating module 405 may be implemented by a processor in a data processing apparatus. Of course, the processor needs to run a computer program in memory to implement its functions.
In order to implement the second data analysis entity-side data processing method according to the embodiment of the present invention, an embodiment of the present invention further provides a data processing apparatus, which corresponds to the second data analysis entity-side data processing method described above, and each step in the second data analysis entity-side data processing method embodiment is also fully applicable to the present data processing apparatus embodiment.
As shown in fig. 5, the data processing apparatus includes: the system comprises a first receiving module 501, a model training module 503 and a second sending module 504, wherein the second sending module 504 is used for sending first model information of a model to a first data analysis entity; the first receiving module 501 is configured to receive the aggregated second model information of the model sent by the first data analysis entity; the model training module 503 is configured to update the model based on the aggregated second model information.
In some embodiments, the first receiving module 501 is further configured to receive an analysis request sent by a first data analysis entity, where the analysis request includes parameters for the second data analysis entity to determine the first model information; the data processing apparatus further includes: a second obtaining module 502, wherein the second obtaining module 502 is configured to obtain data based on the analysis request; the model training module 503 is further configured to determine first model information of the model based on the data.
In some embodiments, model training module 503 is specifically configured to:
updating first model information corresponding to the model based on the aggregated second model information;
and performing model training based on the updated first model information and the data set until the training is finished to obtain an updated model.
In some embodiments, the analysis request includes parameters for the second data analysis entity to determine the first model information, the parameters including one or more of the following information: maximum response time, maximum number of iterations, data type list.
In some embodiments, the model training module 503 determines that the end condition of the model training includes at least one of:
the loss function of the model reaches a value less than or equal to a set threshold value;
the training times of the model reach preset times;
the training time of the model reaches the set time.
In practical applications, the first receiving module 501, the second obtaining module 502, the model training module 503 and the second sending module 504 may be implemented by a processor in a data processing apparatus. Of course, the processor needs to run a computer program in memory to implement its functions.
It should be noted that: in the data processing apparatus provided in the above embodiment, when performing data processing, only the division of each program module is exemplified, and in practical applications, the processing may be distributed to different program modules according to needs, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the data processing apparatus and the data processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
In order to implement the data processing method of the data analysis entity in the embodiment of the present invention, an embodiment of the present invention further provides a data processing apparatus, which corresponds to the data processing method of the data analysis entity, and each step in the data processing method of the data analysis entity is also completely applicable to the embodiment of the data processing apparatus. The data processing apparatus includes:
a third sending module, configured to send registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federated learning parameters, or the address information.
In some embodiments, the data analysis entity is a first data analysis entity, and the third sending module is further configured to: sending a first request to the network element storage function network element, where the first request is used to request address information of a second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information, and the load threshold of the second data analysis entity; the first data analysis entity is further configured to receive address information of the second data analysis entity, which is determined by the network element storage function network element based on the first request.
In order to implement the data processing method on the network element side with the network element storage function according to the embodiment of the present invention, an embodiment of the present invention further provides a data processing apparatus, where the data processing apparatus corresponds to the data processing method on the network element side with the network element storage function, and each step in the data processing method embodiment on the network element side with the network element storage function is also completely applicable to the embodiment of the data processing apparatus. The data processing apparatus includes: a second receiving module and a storage module. The second receiving module is configured to receive registration information sent by a data analysis entity, where the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federated learning parameters, or address information of the data analysis entity; the storage module is used for storing the registration information.
In some embodiments, the second receiving module is further configured to receive a first request sent by the first data analysis entity, where the first request is used to request address information of the second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information and the load threshold of the second data analysis entity; the data processing apparatus further includes: an address determining module, configured to determine address information of the second data analysis entity based on the first request, and send the address information of the second data analysis entity to the first data analysis entity.
Based on the hardware implementation of the program module, and in order to implement the method according to the embodiment of the present invention, an embodiment of the present invention further provides a first network device. Fig. 6 shows only an exemplary structure of the first network device, not its entire structure; part or all of the structure shown in fig. 6 may be implemented as required.
As shown in fig. 6, a first network device 600 provided in an embodiment of the present invention includes: at least one processor 601, memory 602, user interface 603, and at least one network interface 604. The various components in the first network device 600 are coupled together by a bus system 605. It will be appreciated that the bus system 605 is used to enable communications among the components. The bus system 605 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 605 in fig. 6.
The user interface 603 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
The memory 602 in embodiments of the present invention is used to store various types of data to support the operation of the first network device. Examples of such data include: any computer program for operating on a first network device.
The data processing method disclosed by the embodiment of the invention can be applied to the processor 601 or implemented by the processor 601. The processor 601 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the data processing method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 601. The processor 601 may be a general purpose processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 601 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 602, and the processor 601 reads the information in the memory 602 and performs the steps of the data processing method provided by the embodiment of the present invention in combination with the hardware thereof.
In an exemplary embodiment, the first network Device may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general purpose processors, controllers, Micro Controllers (MCUs), microprocessors (microprocessors), or other electronic components for performing the aforementioned methods.
Based on the hardware implementation of the program module, and in order to implement the data processing method on the second data analysis entity side in the embodiment of the present invention, an embodiment of the present invention further provides a second network device. Fig. 7 shows only an exemplary structure of the second network device, not its entire structure; part or all of the structure shown in fig. 7 may be implemented as required.
As shown in fig. 7, a second network device 700 provided in an embodiment of the present invention includes: at least one processor 701, memory 702, user interface 703, and at least one network interface 704. The various components in the second network device 700 are coupled together by a bus system 705. It will be appreciated that the bus system 705 is used to enable communications among the components. The bus system 705 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 705 in fig. 7.
The user interface 703 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
The memory 702 in embodiments of the present invention is used to store various types of data to support the operation of the second network device. Examples of such data include: any computer program for operating on a second network device.
The data processing method disclosed by the embodiment of the invention can be applied to the processor 701, or implemented by the processor 701. The processor 701 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the data processing method may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 701. The processor 701 described above may be a general purpose processor, a digital signal processor, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 701 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 702, and the processor 701 may read information in the memory 702 and complete the steps of the data processing method provided by the embodiments of the present invention in combination with hardware thereof.
In an exemplary embodiment, the second network device 700 may be implemented by one or more ASICs, DSPs, PLDs, CPLDs, FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components for performing the aforementioned methods.
Based on the hardware implementation of the program module, and in order to implement the data processing method at the data analysis entity side in the embodiment of the present invention, an embodiment of the present invention further provides a third network device. Fig. 8 shows only an exemplary structure of the third network device, not its entire structure; part or all of the structure shown in fig. 8 may be implemented as required.
As shown in fig. 8, a third network device 800 according to an embodiment of the present invention includes: at least one processor 801, memory 802, a user interface 803, and at least one network interface 804. The various components in the third network device 800 are coupled together by a bus system 805. It will be appreciated that the bus system 805 is used to enable communications among the connected components. The bus system 805 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 805 in fig. 8.
The user interface 803 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
The memory 802 in embodiments of the present invention is used to store various types of data to support the operation of the third network device. Examples of such data include: any computer program for operating on a third network device.
The data processing method disclosed by the embodiment of the invention can be applied to the processor 801 or implemented by the processor 801. The processor 801 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the data processing method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 801. The processor 801 described above may be a general purpose processor, digital signal processor, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 801 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 802, and the processor 801 reads the information in the memory 802, and performs the steps of the data processing method provided by the embodiment of the present invention in combination with the hardware thereof.
In an exemplary embodiment, the third network device 800 may be implemented by one or more ASICs, DSPs, PLDs, CPLDs, FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components for performing the aforementioned methods.
Based on the hardware implementation of the program module, and in order to implement the data processing method on the network element storage function network element side according to the embodiment of the present invention, a fourth network device is further provided in the embodiment of the present invention. Fig. 9 shows only an exemplary structure of the fourth network device, not its entire structure; part or all of the structure shown in fig. 9 may be implemented as required.
As shown in fig. 9, a fourth network device 900 according to an embodiment of the present invention includes: at least one processor 901, memory 902, a user interface 903, and at least one network interface 904. The various components in the fourth network device 900 are coupled together by a bus system 905. It will be appreciated that the bus system 905 is used to enable communications among the components. The bus system 905 includes a power bus, a control bus, and a status signal bus, in addition to a data bus. For clarity of illustration, however, the various buses are labeled in fig. 9 as bus system 905.
The user interface 903 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
The memory 902 in the embodiments of the present invention is used to store various types of data to support the operation of the fourth network device. Examples of such data include: any computer program for operating on a fourth network device.
The data processing method disclosed by the embodiment of the present invention may be applied to the processor 901, or implemented by the processor 901. The processor 901 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the data processing method may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 901. The processor 901 described above may be a general purpose processor, a digital signal processor, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 901 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 902, and the processor 901 reads information in the memory 902, and performs the steps of the data processing method provided by the embodiment of the present invention in combination with hardware thereof.
In an exemplary embodiment, the fourth network device 900 may be implemented by one or more ASICs, DSPs, PLDs, CPLDs, FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components for performing the aforementioned methods.
It will be appreciated that the memories 602, 702, 802, and 902 may be volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a disk memory or a tape memory. The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memories described in the embodiments of the present invention are intended to comprise, without being limited to, these and any other suitable types of memory.
An embodiment of the present invention further provides a network system. As shown in fig. 10, the network system includes a first network device 600 and at least two second network devices 700, where the first network device 600 serves as a first data analysis entity, each second network device 700 serves as a second data analysis entity, and the first network device 600 is communicatively connected to the at least two second network devices 700 to execute the data processing method described in the foregoing embodiments; for the specific data processing method, reference is made to the foregoing embodiments, and details are not described herein again.
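As a concrete illustration of how such a network system could run one federated round, the Python sketch below assumes that the first model information is a list of NumPy weight arrays, that aggregation is a sample-weighted average (FedAvg-style), and that each connected second network device exposes hypothetical local_train() and update_model() operations; none of these choices is mandated by the embodiments, which leave the aggregation algorithm open. The loop also reflects the training end conditions named later in the claims (loss threshold, number of rounds, training time).

# Sketch only: FedAvg-style aggregation is one possible realization of
# "aggregating the first model information"; local_train()/update_model()
# are hypothetical interfaces of the second network devices.
import time
import numpy as np


def aggregate(first_model_infos, sample_counts):
    """Combine per-entity model information into aggregated second model information."""
    total = float(sum(sample_counts))
    aggregated = []
    for layer in zip(*first_model_infos):  # same layer across all second entities
        aggregated.append(sum(w * (n / total) for w, n in zip(layer, sample_counts)))
    return aggregated


def run_rounds(second_entities, loss_threshold=0.05, max_rounds=100, max_seconds=3600):
    """First-entity loop: collect, aggregate, distribute, until an end condition holds."""
    start = time.time()
    aggregated = None
    for _ in range(max_rounds):  # end condition: number of training rounds reached
        reports = [e.local_train() for e in second_entities]  # first model information + statistics
        aggregated = aggregate([r["weights"] for r in reports],
                               [r["num_samples"] for r in reports])
        for e in second_entities:
            e.update_model(aggregated)  # second data analysis entities update the model
        mean_loss = float(np.mean([r["loss"] for r in reports]))
        if mean_loss <= loss_threshold:  # end condition: loss at or below the set threshold
            break
        if time.time() - start >= max_seconds:  # end condition: training time reached
            break
    return aggregated

The sample-weighted average is only one design choice; any aggregation rule that combines the collected first model information into second model information would fit the same exchange pattern.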
In an exemplary embodiment, an embodiment of the present invention further provides a storage medium, specifically a computer storage medium, which may be a computer-readable storage medium, for example, the memory 602 storing a computer program that is executable by the processor 601 of the first network device 600 to perform the steps of the first data analysis entity-side method according to the embodiment of the present invention; as another example, the memory 702 storing a computer program that is executable by the processor 701 of the second network device 700 to perform the steps of the second data analysis entity-side method according to the embodiment of the present invention; as another example, the memory 802 storing a computer program that is executable by the processor 801 of the third network device 800 to perform the steps of the data analysis entity-side method according to the embodiment of the present invention; and as another example, the memory 902 storing a computer program that is executable by the processor 901 of the fourth network device 900 to perform the steps of the network element storage function network element-side method according to the embodiment of the present invention. The computer-readable storage medium may be a ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM, among others.
It should be noted that: "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
In addition, the technical solutions described in the embodiments of the present invention may be arbitrarily combined without conflict.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (28)

1. A data processing method applied to a first data analysis entity, the method comprising:
acquiring first model information of a model determined by a second data analysis entity;
aggregating the first model information to obtain aggregated second model information of the model;
and sending the aggregated second model information to the second data analysis entity, wherein the aggregated second model information is used for the second data analysis entity to update the model.
2. The method of claim 1, further comprising:
the first data analysis entity sends an analysis request to a second data analysis entity, the analysis request including parameters for the second data analysis entity to determine the first model information.
3. The method according to claim 1 or 2, characterized in that the method further comprises: the first data analysis entity determining address information of a second data analysis entity, comprising:
the first data analysis entity determining address information of a second data analysis entity based on preconfigured address information of the second data analysis entity; or,
and the first data analysis entity determines the address information of the second data analysis entity through a network element storage function network element.
4. The method of claim 3, wherein the determining, by the first data analysis entity, the address information of the second data analysis entity through a network element storage function network element comprises:
the first data analysis entity sends a first request to a network element storage function network element, where the first request is used to request address information of the second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information, and the load threshold of the second data analysis entity;
and receiving the address information of the second data analysis entity determined by the network element storage function network element based on the first request.
5. The method of claim 2, wherein before the first data analysis entity sends an analysis request to the second data analysis entity, the method further comprises:
the first data analysis entity generates the analysis request based on an analysis request sent by a network function NF or a subscribed analysis request.
6. A data processing method applied to a second data analysis entity, the method comprising:
sending first model information of the model to a first data analysis entity;
receiving second aggregated model information of the model sent by the first data analysis entity;
updating the model based on the aggregated second model information.
7. The method of claim 6, wherein the method comprises:
receiving an analysis request sent by a first data analysis entity, wherein the analysis request comprises parameters for determining the first model information by the second data analysis entity;
obtaining data based on the analysis request;
first model information of a model is determined based on the data.
8. The method of claim 6, wherein updating the model based on the aggregated second model information comprises:
updating first model information corresponding to the model based on the aggregated second model information;
and performing model training based on the updated first model information and the data until the training is finished to obtain an updated model.
9. The method of claim 8, wherein the end condition of model training comprises at least one of:
the loss function of the model reaches a value less than or equal to a set threshold;
the number of training iterations of the model reaches a preset number;
the training duration of the model reaches a set duration.
10. A data processing method applied to a data analysis entity, the method comprising:
sending registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federal learning parameters, or the address information.
11. The method of claim 10, wherein the data analysis entity is a first data analysis entity or a second data analysis entity, wherein the first data analysis entity is of a server, a coordinator, or a central trainer type, and wherein the second data analysis entity is of a client, a local trainer, or a client-side type.
12. The method of claim 11, wherein the second data analysis entity is co-located with a 5G NF.
13. The method of claim 10, wherein the federal learning parameters include one or more of the following information for the federal learning: algorithm type, algorithm identification, or algorithm convergence speed.
14. The method of claim 10, wherein the data analysis entity is a first data analysis entity, the method further comprising:
the first data analysis entity sends a first request to the network element storage function network element, where the first request is used to request address information of a second data analysis entity, and the first request includes one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information, and the load threshold of the second data analysis entity;
and receiving the address information of the second data analysis entity determined by the network element storage function network element based on the first request.
15. A data processing method, applied to a network element with a network element storage function, the method comprising:
receiving registration information sent by a data analysis entity, wherein the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federal learning parameters or address information of the data analysis entity;
and the network element storage function network element stores the registration information.
16. The method of claim 15, wherein the data analysis entity is a first data analysis entity or a second data analysis entity, wherein the first data analysis entity is of a server, a coordinator, or a central trainer type, and wherein the second data analysis entity is of a client, a local trainer, or a client-side type.
17. The method of claim 15, wherein the federal learning parameters include one or more of the following information for the federal learning: algorithm type, algorithm identification, or algorithm convergence speed.
18. The method of claim 15, wherein the method further comprises:
receiving a first request sent by a first data analysis entity, wherein the first request is used for requesting address information of a second data analysis entity, and the first request comprises one or more of the following information: the area information corresponding to the first data analysis entity, the capability information of the second data analysis entity for determining the first model information and the load threshold of the second data analysis entity;
determining address information of the second data analysis entity based on the first request, and sending the address information of the second data analysis entity to the first data analysis entity.
19. A data processing apparatus for application to a first data analysis entity, the apparatus comprising:
the first acquisition module is used for acquiring first model information of the model determined by the second data analysis entity;
the aggregation module is used for aggregating the first model information to obtain aggregated second model information of the model;
and the first sending module is used for sending the aggregated second model information to the second data analysis entity, and the aggregated second model information is used for the second data analysis entity to update the model.
20. A data processing apparatus for application to a second data analysis entity, the apparatus comprising:
the second sending module is used for sending the first model information of the model to the first data analysis entity;
a first receiving module, configured to receive second model information after aggregation of the models sent by the first data analysis entity;
and the model training module is used for updating the model based on the aggregated second model information.
21. A data processing apparatus for application to a data analysis entity, the apparatus comprising:
a third sending module, configured to send registration information to a network element storage function network element, where the registration information is used to register the data analysis entity in the network element storage function network element, and the registration information includes one or more of the following information corresponding to the data analysis entity: the type of the data analysis entity, the federal learning parameters, or the address information.
22. A data processing apparatus, applied to a network element with a network element storage function, the apparatus comprising:
a second receiving module, configured to receive registration information sent by a data analysis entity, where the registration information includes one or more of the following information corresponding to the data analysis entity: the type, federal learning parameters or address information of the data analysis entity;
and the storage module is used for storing the registration information.
23. A first network device, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein,
the processor, when executing the computer program, is adapted to perform the steps of the method of any of claims 1 to 5.
24. A second network device, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein,
the processor, when executing the computer program, is adapted to perform the steps of the method of any of claims 6 to 9.
25. A third network device, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein,
the processor, when executing the computer program, is adapted to perform the steps of the method of any of claims 10 to 14.
26. A fourth network device, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein,
the processor, when executing the computer program, is adapted to perform the steps of the method of any of claims 15 to 18.
27. A network system, comprising: a first network device according to claim 23 and a second network device according to claim 24, the first network device being communicatively connected to at least two of the second network devices.
28. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the method of any one of claims 1 to 18.
CN202010438147.4A 2020-05-21 2020-05-21 Data processing method, device, equipment and storage medium Pending CN113723434A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010438147.4A CN113723434A (en) 2020-05-21 2020-05-21 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010438147.4A CN113723434A (en) 2020-05-21 2020-05-21 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113723434A true CN113723434A (en) 2021-11-30

Family

ID=78671310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010438147.4A Pending CN113723434A (en) 2020-05-21 2020-05-21 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113723434A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024104339A1 (en) * 2022-11-14 2024-05-23 中国移动通信有限公司研究院 Data transmission method and apparatus, and related device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination