WO2024093561A1 - Model training method, model testing method, device and storage medium - Google Patents


Info

Publication number
WO2024093561A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
training
test
fusion
indication information
Prior art date
Application number
PCT/CN2023/120161
Other languages
English (en)
French (fr)
Inventor
舒敏
Original Assignee
大唐移动通信设备有限公司
Priority date
Filing date
Publication date
Application filed by 大唐移动通信设备有限公司
Publication of WO2024093561A1

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G06N 20/20 - Ensemble learning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 24/00 - Supervisory, monitoring or testing arrangements
    • H04W 24/08 - Testing, supervising or monitoring using real traffic

Definitions

  • the present disclosure relates to the field of communications, and in particular to a model training method, a model testing method, a device and a storage medium.
  • 3GPP (3rd Generation Partnership Project) proposed a service-oriented management architecture in the management function of 5G (5th Generation Mobile Communication Technology), which can support the management of multiple objects such as virtualized resources, single network element functional characteristics, sub-network functions, slice subnets, and tenant-oriented slices.
  • the present disclosure provides a model training method, a model testing method, a device and a storage medium, which are used to solve the problem of poor management effect when traditional methods are used to implement management tasks in 3GPP.
  • the present disclosure provides a model training method, which is applied to a first device of a management service producer, comprising: receiving a training request from a second device of a management service consumer, the training request carrying training attributes, the training attributes including: training indication information, the training indication information being used to indicate an association relationship between multiple models and a model generation strategy; configuring the training attributes in the first device, and jointly training multiple models according to the training indication information and multiple model identifiers to obtain a training result.
  • the multiple model identifiers are contained in the training attributes or are generated by the first device, and the training result includes a fusion model that has completed training.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the present disclosure provides a model testing method, which is applied to a third device of a management service producer, comprising: receiving a test request from a fourth device of a management service consumer, the test request carrying test attributes, the test attributes including: test data and test indication information, the test indication information being used to indicate the association relationship between multiple models and the model generation strategy; according to the test request, obtaining a fusion model requested for testing; based on the test indication information, using the test data to test the fusion model to obtain a test result, the test result including the performance of the fusion model; and sending the test result to the fourth device.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the present disclosure provides a model training method, which is applied to a second device of a management service consumer, including: generating a training request, the training request carries training attributes, the training attributes include: training indication information, the training indication information is used to indicate the association relationship between multiple models and the model generation strategy when multiple models corresponding to multiple model identifiers are trained, the multiple model identifiers are included in the training attributes or the multiple model identifiers are generated by the first device; sending a training request to the first device of the management service producer.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the present disclosure provides a model testing method, which is applied to a fourth device of a management service consumer.
  • the model testing method includes: generating a test request, the test request carries test attributes, the test attributes include: test data and test indication information, the test indication information is used to indicate the association relationship between multiple models and the model generation strategy; sending a test request to a third device of a management service producer, the test request is used to instruct the third device to obtain a fusion model for the requested test; receiving a test result sent by the third device, the test result includes the performance of the fusion model, and the test result is obtained by the third device testing the fusion model based on the test indication information using the test data.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the present disclosure provides a model training device, which is applied to a first device of a management service producer, including a memory, a transceiver, and a processor:
  • a transceiver, configured to transmit and receive data under the control of the processor;
  • a processor is used to read the computer program in the memory and perform the following operations:
  • receive a training request from a second device of a management service consumer, the training request carrying training attributes, the training attributes including: training indication information, the training indication information being used to indicate an association relationship between multiple models and a model generation strategy;
  • the training attributes are configured in the first device, and the multiple models are jointly trained according to the training indication information and the multiple model identifiers to obtain the training results.
  • the multiple model identifiers are included in the training attributes or the multiple model identifiers are generated by the first device.
  • the training results include the fusion model that has completed the training.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the present disclosure provides a model testing device, which is applied to a third device of a management service producer, including a memory, a transceiver, and a processor:
  • a transceiver, configured to transmit and receive data under the control of the processor;
  • a processor is used to read the computer program in the memory and perform the following operations:
  • receive a test request from a fourth device of a management service consumer, the test request carrying test attributes, the test attributes including: test data and test indication information, the test indication information being used to indicate the association relationship between multiple models and the model generation strategy; according to the test request, obtain the fusion model requested for testing; based on the test indication information, use the test data to test the fusion model to obtain a test result, the test result including the performance of the fusion model; and send the test result to the fourth device.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the present disclosure provides a model training device, which is applied to a second device of a management service consumer, comprising:
  • a generating unit configured to generate a training request, the training request carrying training attributes, the training attributes including: training indication information, the training indication information being used to indicate the association relationship between the multiple models and the model generation strategy when the multiple models corresponding to the multiple model identifiers are trained, the multiple model identifiers being included in the training attributes or the multiple model identifiers being generated by the first device;
  • the sending unit is used to send a training request to the first device of the management service producer.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the present disclosure provides a model testing device, which is applied to a fourth device of a management service consumer, the fourth device comprising:
  • a generating unit used for generating a test request, the test request carries a test attribute, the test attribute includes: test data and test indication information, the test indication information is used for indicating the association relationship between multiple models and the model generation strategy;
  • a sending unit used to send a test request to a third device of the management service producer, where the test request is used to instruct the third device to obtain a fusion model of the requested test;
  • the receiving unit is used to receive a test result sent by a third device, where the test result includes the performance of the fusion model, and the test result is obtained by the third device testing the fusion model using test data based on the test indication information.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the present disclosure provides a processor-readable storage medium storing a computer program, wherein the computer program is used to enable the processor to execute the model training method of the first aspect or the third aspect, or the model testing method of the second aspect or the fourth aspect.
  • the present disclosure provides a computer program product comprising instructions, which, when executed on a computer, causes the computer to execute the model training method of the first or third aspect, or the model testing method of the second or fourth aspect as described above.
  • the present disclosure provides a communication system, comprising any of the above-mentioned first devices and any of the above-mentioned second devices, or comprising any of the above-mentioned third devices and any of the above-mentioned fourth devices.
  • a training request from a second device of a management service consumer is received, the training request carrying training attributes, the training attributes including training indication information used to indicate the association relationship between multiple models and the model generation strategy; the training attributes are configured in a first device, and multiple models are jointly trained according to the training indication information and multiple model identifiers to obtain training results, the multiple model identifiers being included in the training attributes or generated by the first device, and the training results including a fusion model that has completed training. This can realize joint training of multiple models, obtain better model performance, and enhance the intelligence of network operation and maintenance, thereby improving the management capabilities of devices in 3GPP.
  • FIG. 1 is a schematic diagram of an application scenario of a model training method provided by an embodiment of the present disclosure.
  • FIG. 2 is a flow chart of a model training method provided by an embodiment of the present disclosure.
  • FIG. 3 is a flow chart of a model training method provided by another embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of an application scenario of a model testing method provided by an embodiment of the present disclosure.
  • FIG. 5 is a flow chart of a model testing method provided by an embodiment of the present disclosure.
  • FIG. 6 is a flow chart of a model testing method provided by another embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of the structure of a model training device provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a model testing device provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of the structure of a model training device provided by another embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a model testing device provided in another embodiment of the present disclosure.
  • “At least one” means one or more, and “plurality” means two or more.
  • “And/or” describes the association relationship of associated objects, indicating that three relationships may exist.
  • A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural.
  • the character “/” generally indicates that the previous and next associated objects are in an “or” relationship.
  • “At least one of the following” or similar expressions refers to any combination of these items, including any combination of single or plural items.
  • At least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, c can be single or plural.
  • the technical solution provided by the embodiments of the present disclosure can be applicable to a variety of systems, especially 5G systems.
  • the applicable systems can be global system of mobile communication (GSM) system, code division multiple access (CDMA) system, wideband code division multiple access (WCDMA) system, general packet radio service (GPRS) system, long term evolution (LTE) system, LTE frequency division duplex (FDD) system, LTE time division duplex (TDD) system, long term evolution advanced (LTE-A) system, universal mobile telecommunication system (UMTS), worldwide interoperability for microwave access (WiMAX) system, 5G new radio (NR) system, etc.
  • GSM global system of mobile communication
  • CDMA code division multiple access
  • WCDMA wideband code division multiple access
  • GPRS general packet radio service
  • LTE long term evolution
  • FDD LTE frequency division duplex
  • TDD LTE time division duplex
  • LTE-A long term evolution advanced
  • UMTS universal mobile telecommunication system
  • WiMAX worldwide interoperability for microwave access
  • the second device and the fourth device involved in the embodiment of the present disclosure may be a terminal device, specifically a device that provides voice and/or data connectivity to a user, a handheld device with a wireless connection function, or other processing devices connected to a wireless modem.
  • the name of the terminal device may also be different.
  • the terminal device may be called a user equipment (UE).
  • UE user equipment
  • a wireless terminal device may communicate with one or more core networks (CN) via a radio access network (RAN).
  • CN core networks
  • RAN radio access network
  • the wireless terminal device may be a mobile terminal device, such as a mobile phone (or a "cellular" phone) and a computer with a mobile terminal device.
  • a wireless terminal device may also be referred to as a system, a subscriber unit, a subscriber station, a mobile station, a remote station, a receiver, an access point, a remote terminal, an access terminal, a user terminal, a user agent, or a user device, which is not limited in the embodiments of the present disclosure.
  • the first device/third device involved in the embodiments of the present disclosure is a network device, for example, it can be a base station, and the base station can include multiple cells that provide services for terminal devices.
  • the base station can also be called an access point, or it can be a device in the access network that communicates with the wireless terminal device through one or more sectors on the air interface, or other names.
  • the network device can be used to interchange received air frames with Internet Protocol (IP) packets, and act as a router between the wireless terminal device and the rest of the access network, where the rest of the access network may include an Internet Protocol (IP) communication network.
  • IP Internet Protocol
  • the network device can also coordinate the attribute management of the air interface.
  • the network device involved in the embodiments of the present disclosure may be a network device (Base Transceiver Station, BTS) in the Global System for Mobile communications (GSM) or Code Division Multiple Access (CDMA), or a network device (NodeB) in Wide-band Code Division Multiple Access (WCDMA), or an evolved network device (evolutional Node B, eNB or e-NodeB) in the Long Term Evolution (LTE) system, a 5G base station (gNB) in the 5G network architecture (next generation system), or a Home evolved Node B (HeNB), a relay node, a home base station (femto), a pico base station (pico), etc., but is not limited in the embodiments of the present disclosure.
  • network devices may include centralized unit (CU) nodes and distributed unit (DU) nodes, and the centralized unit and the distributed unit may also be geographically separated.
  • Network devices and terminal devices can each use one or more antennas for multiple input multiple output (MIMO) transmission.
  • MIMO transmission can be single user MIMO (SU-MIMO) or multi-user MIMO (MU-MIMO).
  • MIMO transmission can be 2D-MIMO, 3D-MIMO, FD-MIMO or massive-MIMO, or it can be diversity transmission, precoded transmission or beamforming transmission, etc.
  • AI/ML entity refers to any entity that is an AI/ML model or contains an AI/ML model and can be managed as a single composite entity.
  • Management service: the management of telecommunication network standards has always adopted a management architecture model that combines NM (Network Management) and EM (Element Management), where EM implements the management function of a single equipment manufacturer and NM implements the multi-vendor management function at the operator level.
  • NM and EM are interconnected through the standardized northbound interface Itf-N, so that NM can manage and monitor the networks of multiple manufacturers.
  • 3GPP proposed a service-oriented management architecture for 5G management functions, which can support the management of multiple objects such as virtualized resources, single network element functional characteristics, sub-network functions and slice subnets, and tenant-oriented slices.
  • the service-oriented architecture is composed of management functions and management services.
  • the carrier of management services is the management function, and one management function can provide multiple management services.
  • the service-oriented architecture will define management services and interfaces for accessing management services, as well as possible consumers and producers of management services.
  • Management services have three related components: service management interface, service management object model, and service management data.
  • AI/ML models are being used in more and more areas of 5G to achieve management capabilities and/or orchestration capabilities.
  • AI/ML model training in eMDA (management data analysis) work has been studied and is being processed in 3GPP specification work. Due to the complexity of communication systems, for some predictive modeling problems, the structure of the problem itself may suggest the use of multiple models. However, current model training and testing only considers the scenario of single model prediction and analysis, and does not consider the situation of combining prediction and analysis of multiple contributing models.
  • the configured parameters are as shown in Table 1, wherein these configured parameters come from the training request of the second device of the management service consumer and are used to indicate the training of a single model.
  • M in Table 1 means required configuration, O means optional, T means yes, and F means no.
  • there is only a single model identifier in Table 1; that is, only a single model can be trained under the current standard, and training only a single model leads to poor service management capabilities.
  • each training request can indicate an “expected runtime context”, and the attribute “expected runtime context” describes the specific conditions for which the AI/ML entity should be trained.
  • 3GPP has defined some possible specific conditions, but the standard is incomplete here and cannot support the scenario of joint training and testing of multiple models.
  • one example is an AI/ML-enabled function, such as a RAN (radio access network)-side inference function.
  • RAN radio access network
  • an AI/ML-enabled function may contain multiple AI/ML entities, each of which implements only a specific function, and the analysis output of one AI/ML entity may be used as the input of the next AI/ML entity.
  • an ordered chain of models, such as regression or classification models, can be trained and created: the first model predicts a first output target value, which can be used as part of the input of the second model in the chain to predict a second output target value, and so on.
  • the output of the Inter-gNB (next generation base station) beam selection optimization model (the first model) can be used as the input of the handover optimization analysis model (the second model), and the output of the handover optimization analysis model is used as the final output.
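The chained scenario above can be sketched in a few lines of Python. This is a hedged illustration, not the disclosed implementation: the two stand-in models and the feature values are made-up assumptions standing in for the beam-selection and handover-optimization models.

```python
# Sketch of an ordered model chain: each model's prediction is appended
# to the feature vector consumed by the next model in the chain.
def chain_predict(models, features):
    """Run `models` in order; each output becomes an extra input feature."""
    x = list(features)
    output = None
    for model in models:
        output = model(x)   # e.g. beam-selection model, then handover model
        x = x + [output]    # feed the prediction forward along the chain
    return output           # the last model's output is the final output

# Toy stand-ins for the two analysis models mentioned in the text:
beam_selection = lambda x: sum(x) / len(x)   # "first model" (assumed)
handover_opt   = lambda x: max(x)            # "second model" (assumed)

print(chain_predict([beam_selection, handover_opt], [1.0, 3.0]))  # 3.0
```

The key design point is that only the final model's output is returned, matching the text's "the output of the handover optimization analysis model is used as the final output".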
  • the output results of multiple models are used for voting or averaging as the final output.
  • the existing standards are only for the training, testing and verification of a single AI/ML entity.
  • the model training and model testing methods in the existing standards cannot achieve better model performance and generalization performance, which in turn affects the management capabilities of equipment in 3GPP.
  • the embodiments of the present disclosure provide a model training method, a model testing method, an apparatus and a storage medium.
  • this method by adding training indication information to the training attributes, it is possible to indicate the joint training of multiple models and obtain a fusion model, thereby improving the management capabilities of the equipment in 3GPP.
  • the method and the device are based on the same application concept; since they solve the problem by similar principles, the implementations of the device and the method can refer to each other, and repeated descriptions are omitted.
  • FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present disclosure.
  • the scenario includes a first device 11 of a management service producer and a second device 12 of a management service consumer, wherein the AI/ML entity (model) is trained in the first device 11; the first device 11 receives a training request sent by the second device 12, returns a response to the second device 12, then trains the fusion model based on the training request, and after the training is completed, sends the training result to the second device 12.
  • FIG. 2 is a flow chart of a model training method provided by an embodiment of the present disclosure, wherein the method is applied to a first device of a management service producer; specifically, the first device includes a processing circuit coupled to a memory, and the processing circuit is configured to perform the following steps to implement MDA (Management Data Analytics) capabilities.
  • MDA Management Data Analytics
  • the model training method of this embodiment may include:
  • S201 Receive a training request from a second device of a management service consumer.
  • the training request carries training attributes, which include training indication information, and the training indication information is used to indicate the association relationship between multiple models and the model generation strategy.
  • multiple model identifiers are included in the training attributes.
  • the embodiment of the present disclosure mainly extends the model identifier in Table 1 to multiple model identifiers (at least two model identifiers), and adds training indication information to instruct the first device to jointly train multiple models to obtain a fusion model.
  • the training indication information is used to enhance the training effect of the fusion model.
  • Table 2 is as follows:
  • the plurality of model identifiers are generated by the first device.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the training indication information may also include a fusion identifier, that is, a configured identifier for the fusion model obtained by training.
  • the model identifier may also be referred to as an entity identifier, and each entity is a model.
  • each AI/ML entity contains a model and corresponds to a model identifier, and the fusion model obtained by training includes multiple AI/ML entities and has a fusion identifier.
  • the association information between multiple models is used to indicate the order of training the multiple models and the algorithm used by each model.
  • the multi-model fusion algorithm is specifically an algorithm list, including multiple algorithms, such as Bagging (guided aggregation algorithm), Stacking (stacking algorithm), Adaboost (iterative algorithm), random forest, etc.
  • the association information between multiple models includes the correspondence between the model identification list and the algorithm list, for example, model identification A corresponds to Bagging, model identification B corresponds to Stacking, model identification C corresponds to Adaboost, and model identification D corresponds to Random Forest.
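The correspondence between the model identifier list and the algorithm list can be pictured as a simple mapping. The field names and the validation helper below are illustrative assumptions, not fields defined by the disclosure or by 3GPP.

```python
# Hypothetical encoding of the association information described above:
# each model identifier is paired with one fusion algorithm by name.
association_info = {
    "modelA": "Bagging",        # guided aggregation algorithm
    "modelB": "Stacking",       # stacking algorithm
    "modelC": "Adaboost",       # iterative algorithm
    "modelD": "RandomForest",
}

fusion_algorithms = ["Bagging", "Stacking", "Adaboost", "RandomForest"]

def validate(association, supported):
    """Check that every requested model maps to a supported algorithm
    before the producer starts joint training."""
    return all(alg in supported for alg in association.values())

print(validate(association_info, fusion_algorithms))  # True
```

A producer could run such a check when deciding whether it can honor a training request.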
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the model initialization method includes a different initialization method for each model in each training. For example, when three models (model A, model B and model C) are jointly trained, three sets of training data are used for model A: the first set trains model A using the first initialization method, the second set trains model A using the second initialization method, and the third set trains model A using the third initialization method.
  • initialization is to initialize the model parameters using an initialization function.
  • the number of model training rounds refers to the number of rounds of training for each set of data, such as for model A, the first set of training data is used for 10 rounds of training, the second set of training data is used for 10 rounds of training, and the third set of training data is used for 10 rounds of training.
  • the loss function corresponding to the model refers to the loss function used by each model during the training process, such as the loss function loss function a used by model A, the loss function loss function b used by model B, and the loss function loss function c used by model C.
  • the network structure is such as the number of network layers.
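The four items of the fusion model generation strategy can be gathered into one configuration object per model. This is a sketch under assumptions: the field names and default values are invented for illustration and are not defined by the disclosure.

```python
# Hypothetical per-model carrier for the fusion model generation strategy.
from dataclasses import dataclass

@dataclass
class GenerationStrategy:
    init_method: str = "xavier"     # model initialization method (assumed name)
    training_rounds: int = 10       # number of model training rounds per data set
    loss_function: str = "mse"      # loss function corresponding to the model
    network_layers: int = 3         # network structure: number of network layers

# Each jointly trained model may carry its own strategy, mirroring the
# text's example of loss functions a, b, c for models A, B, C:
strategies = {
    "modelA": GenerationStrategy(init_method="xavier"),
    "modelB": GenerationStrategy(init_method="kaiming", loss_function="cross_entropy"),
    "modelC": GenerationStrategy(init_method="normal"),
}
print(strategies["modelB"].loss_function)  # cross_entropy
```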
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the weighted average method refers to performing a weighted average calculation on the output of each model as the output of the fusion model.
  • the fusion model obtained by training includes model A, model B, and model C.
  • the first training data is input into model A, model B, and model C respectively, and corresponding prediction results x, prediction results y, and prediction results z are obtained respectively.
  • the result obtained by weighted averaging the prediction results x, prediction results y, and prediction results z is the final output result of the fusion model.
  • the direct average method is to calculate the sum of the prediction results x, prediction results y, and prediction results z, and then divide it by 3, which is the final output result of the fusion model.
  • the voting method refers to selecting the maximum value, minimum value, or middle value among the prediction results x, prediction results y, and prediction results z as the final output result of the fusion model.
  • the learning method refers to feeding the prediction results x, y, and z into another learning model D, whose output is used as the final output of the fusion model.
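The four integration methods above can be illustrated with a minimal Python sketch. This is a hedged example only: the weights and the stand-in learning model D (a fixed linear combiner rather than a trained model) are assumptions for illustration.

```python
def weighted_average(preds, weights):
    """Weighted average method: weighted mean of the model outputs."""
    return sum(w * p for w, p in zip(weights, preds)) / sum(weights)

def direct_average(preds):
    """Direct average method: sum of the outputs divided by their count."""
    return sum(preds) / len(preds)

def vote(preds, rule="middle"):
    """Voting method: pick the maximum, minimum, or middle value."""
    ordered = sorted(preds)
    return {"max": ordered[-1],
            "min": ordered[0],
            "middle": ordered[len(ordered) // 2]}[rule]

def learn(preds, model_d=lambda xs: 0.5 * xs[0] + 0.3 * xs[1] + 0.2 * xs[2]):
    """Learning method: feed the predictions into another model D
    (here a fixed linear combiner standing in for a trained model)."""
    return model_d(preds)
```

With predictions x, y, z from models A, B, and C, each function returns one candidate final output of the fusion model.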
  • the above training attributes are all sent from the second device to the first device.
  • the second device is such as a terminal device.
  • a user can operate the terminal device to send a training request to the first device, instructing the first device to train the fusion model.
  • S202 Configure training attributes in the first device, and perform joint training on multiple models according to the training instruction information and multiple model identifiers to obtain training results.
  • the training attributes are configured in the service management interface of the first device.
  • model training in the present disclosure is performed in the first device; the service may be provided by network element management, such as an OMC-R (network element) that manages the base station, or by a higher level of network management, or the management service may be combined with the network element.
  • the training of the AI/ML model on the RAN side requires OAM (Operation Administration and Maintenance) to provide training service support.
  • OAM can implement management functions and can be set in the base station or outside the base station.
  • the management service can obtain the original network data of the managed network and service from the network functions of the managed network and service, and then train the fusion model.
  • one or more second devices may send a training request to the first device, with the training request carrying the training attributes described above. The first device then determines, based on its own resources and an analysis of the training attributes, whether the training requirements can be met. If the fusion model can be trained, it responds to the second device that training can be performed; otherwise, it responds that training cannot be performed.
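The admission step just described — the first device checking its own resources against the training attributes before accepting or rejecting the request — might be sketched as follows; the resource fields and response strings are illustrative assumptions, not an interface defined by the disclosure.

```python
def handle_training_request(training_attributes, free_cpu_cores, free_memory_gb):
    """Decide whether the first device can satisfy the training requirements.

    The requirement fields ("required_cores", "required_memory_gb") are
    hypothetical names used only for this sketch.
    """
    required_cores = training_attributes.get("required_cores", 1)
    required_memory = training_attributes.get("required_memory_gb", 1)
    if free_cpu_cores >= required_cores and free_memory_gb >= required_memory:
        return {"status": "training can be performed"}
    return {"status": "training cannot be performed"}
```

The response is then returned to the second device, which learns whether its training request was accepted.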
  • the training attributes are configured in the interface of the management service corresponding to the first device to train the fusion model.
  • data samples are obtained from the training data source configured in the training attributes, and the data samples are divided into a training set and a test set.
  • the training set is used to train the fusion model
  • the test set is used to test the trained fusion model.
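As a minimal sketch of the bullets above, the data samples obtained from the training data source can be divided into a training set and a test set; the 80/20 ratio below is an assumption, since the disclosure does not fix a split.

```python
def split_samples(samples, train_fraction=0.8):
    """Divide the data samples into a training set and a test set.

    The train_fraction default is an illustrative assumption.
    """
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]  # (training set, test set)
```

The training set then feeds the joint training of the fusion model, and the test set is held back for the testing phase.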
  • training of the fusion model can be indicated in the training attributes, and the corresponding multi-model fusion algorithm, model data source, fusion model generation strategy, and fusion model integration method can be selected to start training the fusion model.
  • the training results include the trained fusion model.
  • the training results also include a training report for the fusion model, which records the performance of the fusion model, such as a confidence level indicating how confident the AI/ML model is when inferring on data with the same distribution as the training data.
  • all relevant training processes corresponding to the multi-model joint training can be indexed through the fusion identifier.
  • the training report can be sent to the second device so that the second device can learn the training result of the fusion model.
  • the content of the training report is shown in Table 3, where CM (Conditional Mandatory) indicates that an item is mandatory under certain conditions.
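Since Table 3 itself is not reproduced here, the sketch below only models the report fields that the surrounding text actually names — the fusion identifier that indexes all related training processes, and a confidence figure; everything else about the structure is an assumption.

```python
# Hypothetical shape of a training report; field names are illustrative.
training_report = {
    "fusion_id": "fusion-001",            # indexes all related training processes
    "performance": {"confidence": 0.93},  # confidence on data distributed like the training data
}

def summarize(report):
    """Render the report for sending to the second device."""
    return f"{report['fusion_id']}: confidence={report['performance']['confidence']}"
```

Such a report can then be sent to the second device so that it learns the training result of the fusion model.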
  • the trained fusion model can be configured in network element devices, such as base stations, switches, and routers to implement the management capabilities of the corresponding network element devices.
  • network data of the managed network and services can be obtained from the network functions (NF) of the managed network and services, and the network data can be used as training data.
  • different multi-model fusion algorithms and strategies are selected to perform joint training of the fusion model to obtain a fusion model, wherein the fusion model can improve the service capability of the corresponding equipment.
  • FIG. 3 is a flow chart of a model training method provided by another embodiment of the present disclosure, the method is applied to a second device of a management service consumer, and specifically includes the following steps:
  • S301 Generate a training request, where the training request carries training attributes
  • the training attributes include: training indication information
  • the training indication information is used to indicate the association relationship between multiple models and the model generation strategy when multiple models corresponding to multiple model identifiers are trained, and the multiple model identifiers are included in the training attributes or the multiple model identifiers are generated by the first device.
  • S302 Send a training request to a first device of a management service producer.
  • sending a training request to the first device of the management service producer can instruct the first device to train the fusion model based on the training request.
  • the implementation principle and technical effects of the model training method applied to the second device of the management service consumer can refer to the aforementioned embodiments and will not be repeated here.
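Steps S301 and S302 on the consumer side might be sketched as below; the attribute field names and the callable transport are illustrative assumptions, not an interface defined by the disclosure.

```python
def build_training_request(model_ids=None):
    """S301: generate a training request carrying the training attributes."""
    attributes = {
        "training_indication": {
            "train_fusion_model": True,
            "association_info": "joint",        # association between the models
            "fusion_algorithm": "bagging",      # multi-model fusion algorithm (example)
            "generation_strategy": {"rounds": 10},
            "integration_method": "weighted_average",
        }
    }
    if model_ids is not None:
        # Model identifiers may be carried in the training attributes,
        # or else generated by the first device.
        attributes["model_ids"] = model_ids
    return {"training_attributes": attributes}

def send_training_request(request, transport):
    """S302: send the training request to the first device of the producer."""
    return transport(request)
```

The `transport` callable abstracts whatever management-service interface actually carries the request.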
  • the trained fusion model needs to be tested. Referring to FIG. 4, the system includes a third device 41 of the management service producer and a fourth device 42 of the management service consumer. The AI/ML entity (model) is tested in the third device 41: the third device 41 receives the test request sent by the fourth device 42, returns a response to the fourth device 42, and then tests the fusion model based on the test request. After the test is completed, the test result is sent to the fourth device 42.
  • the third device is a network device different from the first device, the first device is a device for training the fusion model, the third device is a device that can use the fusion model, and the fourth device and the second device may be the same or different.
  • the method is applied to a third device of a management service producer, specifically, the third device includes a processing circuit coupled to a memory, and the processing circuit is configured to perform the following steps to implement MDA (Management Data Analytics) capabilities.
  • the model testing method of this embodiment may include:
  • S501 Receive a test request from a fourth device of a management service consumer.
  • the test request carries test attributes, which include test data and test indication information; the test indication information is used to indicate the association relationship between multiple models and the model generation strategy.
  • specifically, the test indication information includes at least one of: whether to test the fusion model, the association information between multiple models, the multi-model fusion algorithm, the fusion model generation strategy, and the fusion model integration method.
  • the association information between multiple models, the multi-model fusion algorithm, the fusion model generation strategy, and the fusion model integration method have the same meanings as in the training process described above and are not repeated here.
  • the test data may also be a test data set obtained by the third device from a data source, and the data sent by the fourth device and having the same feature distribution as the test data set is used as the test data of the fusion model.
  • the test request includes: a fusion identifier of the fusion model, and then the fusion model to be tested can be determined according to the test request.
  • the fusion model can be trained by adopting the above-mentioned model training method.
  • the third device determines whether the test requirements of the test attributes are met based on its own resources and an analysis of the test attributes. If the fusion model can be tested, the third device responds to the test request of the fourth device that testing can be performed; otherwise, it responds that testing cannot be performed.
  • the present disclosure configures the test attributes in the management service interface of the third device to facilitate testing of the fusion model.
  • the test indication information indicates the multi-model fusion algorithm, fusion model generation strategy, fusion model integration method, etc. of multiple models. Therefore, the test data can be input into the fusion model, and the fusion model processes the test data according to the test indication information to obtain the test results.
  • when the third device tests the fusion model, the relevant test indication information is selected, the test data is input into the fusion model to obtain the prediction results, and the loss between the prediction results and the true values is calculated using a preset loss function, thereby determining the performance of the fusion model.
  • a test result may be generated, where the test result includes the performance of the fusion model, for example, a recall rate, the area under the receiver operating characteristic curve (ROC-AUC) of a classification model, or a confidence level.
  • S504 Send the test result to the fourth device.
  • the integrated model is tested according to the fusion algorithm and strategy to obtain the test results, and evaluation indices of the model, such as performance indices and confidence, are obtained through evaluation.
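Two of the evaluation indices named above, recall and ROC-AUC, can be computed without any ML library; the AUC below uses the pairwise-ranking definition (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting one half).

```python
def recall(y_true, y_pred):
    """Fraction of actual positives that the model predicted as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    return tp / positives if positives else 0.0

def roc_auc(y_true, scores):
    """ROC-AUC via the pairwise-ranking definition."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Such indices, together with a confidence figure, can populate the test result that is returned to the fourth device.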
  • the present disclosure can deploy a fusion model whose test results indicate that it has passed the test to the corresponding network element device for use.
  • the present disclosure can realize the fusion model based on multi-model joint training in the communication system.
  • the prediction and generalization capabilities of the model can be significantly improved, thereby achieving better network performance in network intelligence and network autonomy systems.
  • Figure 6 is a flow chart of a model testing method provided by another embodiment of the present disclosure, the method is applied to a fourth device for managing service consumers.
  • the model testing method of this embodiment may include:
  • S601 Generate a test request.
  • the test request carries test attributes, which include test data and test indication information.
  • the test indication information is used to indicate the association relationship between multiple models and the model generation strategy.
  • the test indication information includes: at least one of indicating whether to test the fusion model, the association information between multiple models, the multi-model fusion algorithm, the fusion model generation strategy, and the fusion model integration method.
  • S602 Send a test request to a third device of the management service producer.
  • the test request is used to instruct the third device to obtain the fusion model of the requested test.
  • S603 Receive a test result sent by a third device.
  • the test result includes the performance of the fusion model, and the test result is obtained by the third device testing the fusion model using test data based on the test indication information.
  • the embodiment of the present disclosure provides a model training device, and the model training device of this embodiment may be a first device.
  • the model training device may include a transceiver 701 , a processor 702 , and a memory 703 .
  • the transceiver 701 is used to receive and send data under the control of the processor 702 .
  • the bus architecture may include any number of interconnected buses and bridges; specifically, one or more processors represented by the processor 702 and various memory circuits represented by the memory 703 are linked together.
  • the bus architecture can also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are all well known in the art and are therefore not further described herein.
  • the bus interface provides an interface.
  • the transceiver 701 may be a plurality of components, namely, a transmitter and a receiver, providing units for communicating with various other devices on a transmission medium, and these transmission media include wireless channels, wired channels, optical cables and other transmission media.
  • the model training device may further include a user interface 704.
  • the user interface 704 may also be an interface capable of connecting external or internal devices, including but not limited to a keypad, display, speakers, microphone, joystick, etc.
  • the processor 702 is responsible for managing the bus architecture and general processing, and the memory 703 can store data used by the processor 702 when performing operations.
  • processor 702 can be a central processing unit (CPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or a complex programmable logic device (CPLD), and processor 702 can also adopt a multi-core architecture.
  • the processor 702 is used to execute any method related to the first device provided by the embodiment of the present disclosure according to the obtained executable instructions by calling the computer program stored in the memory 703.
  • the processor and the memory can also be arranged physically separately.
  • the processor 702 is used to perform the following operations: receiving a training request from a second device that manages a service consumer, the training request carries training attributes, the training attributes include: training indication information, the training indication information is used to indicate the association relationship between multiple models and the model generation strategy; configuring the training attributes in the first device, and according to the training indication information and multiple model identifiers, jointly training the multiple models to obtain training results, the multiple model identifiers are included in the training attributes or the multiple model identifiers are generated by the first device, and the training results include a fusion model that has completed the training.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the embodiment of the present disclosure provides a model testing device, and the model testing device of this embodiment may be a third device.
  • the model testing device may include a transceiver 801, a processor 802, and a memory 803.
  • the transceiver 801 is used to receive and send data under the control of the processor 802 .
  • the bus architecture may include any number of interconnected buses and bridges; specifically, one or more processors represented by the processor 802 and various memory circuits represented by the memory 803 are linked together.
  • the bus architecture may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not further described herein.
  • the bus interface provides an interface.
  • the transceiver 801 may be a plurality of components, namely, a transmitter and a receiver, providing a unit for communicating with various other devices on a transmission medium, which transmission medium includes a wireless channel, a wired channel, an optical cable, and other transmission media.
  • the processor 802 is responsible for managing the bus architecture and general processing, and the memory 803 may store data used by the processor 802 when performing operations.
  • the processor 802 may be a CPU, an ASIC, an FPGA or a CPLD, and the processor may also adopt a multi-core architecture.
  • the processor 802 is used to execute any method related to the network device provided by the embodiment of the present disclosure according to the obtained executable instructions by calling the computer program stored in the memory 803.
  • the processor and the memory can also be arranged physically separately.
  • the processor 802 is used to perform the following operations: receiving a test request from a fourth device of a management service consumer, the test request carrying test attributes, the test attributes including: test data and test indication information, the test indication information being used to indicate the association relationship between multiple models and the model generation strategy; obtaining, according to the test request, the fusion model requested for testing; based on the test indication information, using the test data to test the fusion model to obtain a test result, the test result including the performance of the fusion model; and sending the test result to the fourth device.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, the association information between multiple models, the multi-model fusion algorithm, the fusion model generation strategy, and the fusion model integration method.
  • the embodiment of the present disclosure provides a model training device, and the model training device of this embodiment can be a second device.
  • the model training device can include: a generating unit 901 and a sending unit 902 .
  • a generating unit 901 is used to generate a training request, where the training request carries a training attribute, where the training attribute includes: training indication information, where the training indication information is used to indicate the association relationship between the multiple models and the model generation strategy when multiple models corresponding to the multiple model identifiers are trained, where the multiple model identifiers are included in the training attribute or the multiple model identifiers are generated by the first device;
  • the sending unit 902 is used to send a training request to the first device of the management service producer.
  • the training indication information includes: at least one of indicating whether to train the fusion model, association information between multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
  • the fusion model generation strategy includes: at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
  • the fusion model integration method includes: at least one of a weighted or direct average method, a voting method, and a learning method.
  • the embodiment of the present disclosure further provides a model testing device, and the model testing device of this embodiment can be a fourth device.
  • the model testing device includes: a generating unit 1001, a sending unit 1002, and a receiving unit 1003.
  • a generating unit 1001 is used to generate a test request, the test request carries test attributes, the test attributes include: test data and test indication information, the test indication information is used to indicate the association relationship between multiple models and the model generation strategy;
  • a sending unit 1002 is used to send a test request to a third device of the management service producer, where the test request is used to instruct the third device to obtain a fusion model for the requested test;
  • the receiving unit 1003 is used to receive a test result sent by a third device, where the test result includes the performance of the fusion model, and the test result is obtained by the third device testing the fusion model using test data based on the test indication information.
  • the test indication information includes at least one of: an indication of whether to test the fusion model, the association information between multiple models, the multi-model fusion algorithm, the fusion model generation strategy, and the fusion model integration method.
  • each functional unit in each embodiment of the present disclosure may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a processor-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the various embodiments of the present disclosure.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
  • the embodiment of the present disclosure provides a processor-readable storage medium, which stores a computer program, and the computer program is used to enable the processor to execute any method related to the first device to the fourth device provided in the embodiment of the present disclosure.
  • the processor is enabled to implement all the method steps implemented by the first to fourth devices in the above method embodiments and can achieve the same technical effects; the parts and beneficial effects that are the same as in the method embodiments are not described in detail here.
  • the processor-readable storage medium can be any available medium or data storage device that can be accessed by the processor, including but not limited to magnetic storage (such as floppy disks, hard disks, magnetic tapes, magneto-optical disks (MO), etc.), optical storage (such as CD, DVD, BD, HVD, etc.), and semiconductor storage (such as ROM, EPROM, EEPROM, non-volatile memory (NAND FLASH), solid-state drive (SSD)), etc.
  • the embodiments of the present disclosure may be provided as methods, systems, or computer program products. Therefore, the present disclosure may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) containing computer-usable program codes.
  • each process and/or box in the flowchart and/or block diagram, as well as the combination of the processes and/or boxes in the flowchart and/or block diagram, can be implemented by computer executable instructions.
  • These computer executable instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one process or multiple processes in the flowchart and/or one box or multiple boxes in the block diagram.
  • processor-executable instructions may also be stored in a processor-readable memory that can direct a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the processor-readable memory produce a product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • processor-executable instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.


Abstract

The present disclosure provides a model training method, a model testing method, a device, and a storage medium. The model training method includes: receiving a training request from a second device of a management service consumer, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate the association relationship between multiple models and the model generation strategy; configuring the training attributes in a first device, and jointly training the multiple models according to the training indication information and multiple model identifiers to obtain a training result, where the training result includes a trained fusion model. The present disclosure enables joint training of multiple models, yields better model performance, enhances the intelligence of network operation and maintenance, and thereby improves the management capability of devices in 3GPP.

Description

Model training method, model testing method, device, and storage medium
The present disclosure claims priority to the Chinese patent application filed with the Chinese Patent Office on November 4, 2022, with application number 202211378359.3 and entitled "Model training method, model testing method, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communications, and in particular to a model training method, a model testing method, a device, and a storage medium.
Background
3GPP (3rd Generation Partnership Project) has proposed a service-oriented management architecture in the 5G (5th Generation Mobile Communication Technology) management functions, which can support the management of multiple objects such as virtualized resources, functional characteristics of single network elements, sub-network functions, slice subnets, and tenant-oriented slices.
However, in 3GPP the above management capabilities are implemented in a traditional manner, which results in poor management performance.
Summary
The present disclosure provides a model training method, a model testing method, a device, and a storage medium, which are used to solve the problem of poor management performance when management tasks in 3GPP are implemented in a traditional manner.
In a first aspect, the present disclosure provides a model training method applied to a first device of a management service producer, including: receiving a training request from a second device of a management service consumer, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate the association relationship between multiple models and the model generation strategy; and configuring the training attributes in the first device and jointly training the multiple models according to the training indication information and multiple model identifiers to obtain a training result, where the multiple model identifiers are included in the training attributes or are generated by the first device, and the training result includes a trained fusion model.
Optionally, the training indication information includes at least one of: an indication of whether to train the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
Optionally, if the training indication information includes the fusion model generation strategy, the fusion model generation strategy includes at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
Optionally, if the training indication information includes the fusion model integration method, the fusion model integration method includes at least one of: a weighted or direct average method, a voting method, and a learning method.
In a second aspect, the present disclosure provides a model testing method applied to a third device of a management service producer, including: receiving a test request from a fourth device of a management service consumer, where the test request carries test attributes, the test attributes include test data and test indication information, and the test indication information is used to indicate the association relationship between multiple models and the model generation strategy; obtaining, according to the test request, the fusion model requested to be tested; testing the fusion model with the test data based on the test indication information to obtain a test result, where the test result includes the performance of the fusion model; and sending the test result to the fourth device.
Optionally, the test indication information includes at least one of: an indication of whether to test the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
In a third aspect, the present disclosure provides a model training method applied to a second device of a management service consumer, including: generating a training request, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate, when multiple models corresponding to multiple model identifiers are trained, the association relationship between the multiple models and the model generation strategy, where the multiple model identifiers are included in the training attributes or are generated by the first device; and sending the training request to a first device of a management service producer.
Optionally, the training indication information includes at least one of: an indication of whether to train the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
Optionally, if the training indication information includes the fusion model generation strategy, the fusion model generation strategy includes at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
Optionally, if the training indication information includes the fusion model integration method, the fusion model integration method includes at least one of: a weighted or direct average method, a voting method, and a learning method.
In a fourth aspect, the present disclosure provides a model testing method applied to a fourth device of a management service consumer, the model testing method including: generating a test request, where the test request carries test attributes, the test attributes include test data and test indication information, and the test indication information is used to indicate the association relationship between multiple models and the model generation strategy; sending the test request to a third device of a management service producer, where the test request is used to instruct the third device to obtain the fusion model requested to be tested; and receiving a test result sent by the third device, where the test result includes the performance of the fusion model and is obtained by the third device testing the fusion model with the test data based on the test indication information.
Optionally, the test indication information includes at least one of: an indication of whether to test the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
In a fifth aspect, the present disclosure provides a model training apparatus applied to a first device of a management service producer, including a memory, a transceiver, and a processor:
a memory, configured to store a computer program;
a transceiver, configured to receive and send data under the control of the processor;
a processor, configured to read the computer program in the memory and perform the following operations:
receiving a training request from a second device of a management service consumer, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate the association relationship between multiple models and the model generation strategy;
configuring the training attributes in the first device, and jointly training the multiple models according to the training indication information and multiple model identifiers to obtain a training result, where the multiple model identifiers are included in the training attributes or are generated by the first device, and the training result includes a trained fusion model.
Optionally, the training indication information includes at least one of: an indication of whether to train the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
Optionally, if the training indication information includes the fusion model generation strategy, the fusion model generation strategy includes at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
Optionally, if the training indication information includes the fusion model integration method, the fusion model integration method includes at least one of: a weighted or direct average method, a voting method, and a learning method.
In a sixth aspect, the present disclosure provides a model testing apparatus applied to a third device of a management service producer, including a memory, a transceiver, and a processor:
a memory, configured to store a computer program;
a transceiver, configured to receive and send data under the control of the processor;
a processor, configured to read the computer program in the memory and perform the following operations:
receiving a test request from a fourth device of a management service consumer, where the test request carries test attributes, the test attributes include test data and test indication information, and the test indication information is used to indicate the association relationship between multiple models and the model generation strategy; obtaining, according to the test request, the fusion model requested to be tested; testing the fusion model with the test data based on the test indication information to obtain a test result, where the test result includes the performance of the fusion model; and sending the test result to the fourth device.
Optionally, the test indication information includes at least one of: an indication of whether to test the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
In a seventh aspect, the present disclosure provides a model training apparatus applied to a second device of a management service consumer, including:
a generating unit, configured to generate a training request, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate, when multiple models corresponding to multiple model identifiers are trained, the association relationship between the multiple models and the model generation strategy, where the multiple model identifiers are included in the training attributes or are generated by the first device;
a sending unit, configured to send the training request to a first device of a management service producer.
Optionally, the training indication information includes at least one of: an indication of whether to train the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
Optionally, if the training indication information includes the fusion model generation strategy, the fusion model generation strategy includes at least one of: a model initialization method, a number of model training rounds, a loss function corresponding to the model, and a network structure of the model.
Optionally, if the training indication information includes the fusion model integration method, the fusion model integration method includes at least one of: a weighted or direct average method, a voting method, and a learning method.
In an eighth aspect, the present disclosure provides a model testing apparatus applied to a fourth device of a management service consumer, including:
a generating unit, configured to generate a test request, where the test request carries test attributes, the test attributes include test data and test indication information, and the test indication information is used to indicate the association relationship between multiple models and the model generation strategy;
a sending unit, configured to send the test request to a third device of a management service producer, where the test request is used to instruct the third device to obtain the fusion model requested to be tested;
a receiving unit, configured to receive a test result sent by the third device, where the test result includes the performance of the fusion model and is obtained by the third device testing the fusion model with the test data based on the test indication information.
Optionally, the test indication information includes at least one of: an indication of whether to test the fusion model, association information between the multiple models, a multi-model fusion algorithm, a fusion model generation strategy, and a fusion model integration method.
In a ninth aspect, the present disclosure provides a processor-readable storage medium storing a computer program, where the computer program is used to cause a processor to execute the model training method of the first or third aspect, or the model testing method of the second or fourth aspect.
In a tenth aspect, the present disclosure provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the model training method of the first or third aspect, or the model testing method of the second or fourth aspect.
In an eleventh aspect, the present disclosure provides a communication system, including any of the above first devices and any of the above second devices, or including any of the above third devices or fourth devices.
According to the model training method, model testing method, apparatus, and storage medium provided by the present disclosure, a training request from a second device of a management service consumer is received, where the training request carries training attributes, the training attributes include training indication information, and the training indication information is used to indicate the association relationship between multiple models and the model generation strategy; the training attributes are configured in a first device, and the multiple models are jointly trained according to the training indication information and multiple model identifiers to obtain a training result, where the multiple model identifiers are included in the training attributes or are generated by the first device, and the training result includes a trained fusion model. Joint training of multiple models can thus be achieved, yielding better model performance, enhancing the intelligence of network operation and maintenance, and in turn improving the management capability of devices in 3GPP.
It should be understood that the content described in this Summary is not intended to limit the key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario of a model training method provided by an embodiment of the present disclosure;
FIG. 2 is a flow chart of a model training method provided by an embodiment of the present disclosure;
FIG. 3 is a flow chart of a model training method provided by another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an application scenario of a model testing method provided by an embodiment of the present disclosure;
FIG. 5 is a flow chart of a model testing method provided by an embodiment of the present disclosure;
FIG. 6 is a flow chart of a model testing method provided by another embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a model training apparatus provided by an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a model testing apparatus provided by an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a model training apparatus provided by another embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a model testing apparatus provided by another embodiment of the present disclosure.
Detailed Description
In the present disclosure, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or multiple items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
It can be understood that the steps or operations in the embodiments of the present disclosure are merely examples; other operations or variations of the various operations may also be performed. In addition, the steps may be executed in an order different from that presented in the embodiments of the present disclosure, and it may not be necessary to perform all the operations in the embodiments of the present disclosure.
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below in conjunction with the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.
The technical solutions provided by the embodiments of the present disclosure are applicable to a variety of systems, especially 5G systems. For example, applicable systems may be the global system of mobile communication (GSM) system, the code division multiple access (CDMA) system, the wideband code division multiple access (WCDMA) general packet radio service (GPRS) system, the long term evolution (LTE) system, the LTE frequency division duplex (FDD) system, the LTE time division duplex (TDD) system, the long term evolution advanced (LTE-A) system, the universal mobile telecommunication system (UMTS), the worldwide interoperability for microwave access (WiMAX) system, the 5G New Radio (NR) system, etc. These systems include terminal devices or network devices. The systems may also include a core network part, such as the Evolved Packet System (EPS) and the 5G system (5GS).
本公开实施例涉及的第二设备和第四设备可以是终端设备,具体可以是指向用户提供语音和/或数据连通性的设备,具有无线连接功能的手持式设备、或连接到无线调制解调器的其他处理设备等。在不同的系统中,终端设备的名称可能也不相同,例如在5G系统中,终端设备可以称为用户设备(User Equipment,UE)。无线终端设备可以经无线接入网(Radio Access Network,RAN)与一个或多个核心网(Core Network,CN)进行通信,无线终端设备可以是移动终端设备,如移动电话(或称为“蜂窝”电话)和具有移动终端设备的计算机,例如,可以是便携式、袖珍式、手持式、计算机内置的或者车载的移动装置,它们与无线接入网交换语音和/或数据。例如,个人通信业务(Personal Communication Service,PCS)电话、无绳电话、会话发起协议(Session Initiated Protocol,SIP)话机、无线本地环路(Wireless Local Loop,WLL)站、个人数字助理(Personal Digital Assistant,PDA)等设备。无线终端设备也可以称为系统、订户单元(subscriber unit)、订户站(subscriber station)、移动站(mobile station)、移动台(mobile)、远程站(remote station)、接入点(access point)、远程终端设备(remote terminal)、接入终端设备(access terminal)、用户终端设备(user terminal)、用户代理(user agent)、用户装置(user device),本公开实施例中并不限定。
本公开实施例涉及的第一设备/第三设备是网络设备,例如,可以是基站,该基站可以包括多个为终端设备提供服务的小区。根据具体应用场合不同,基站又可以称为接入点,或者可以是接入网中在空中接口上通过一个或多个扇区与无线终端设备通信的设备,或者其它名称。网络设备可用于将收到的空中帧与网际协议(Internet Protocol,IP)分组进行相互更换,作为无线终端设备与接入网的其余部分之间的路由器,其中接入网的其余部分可包括网际协议(IP)通信网络。网络设备还可协调对空中接口的属性管理。例如,本公开实施例涉及的网络设备可以是全球移动通信系统(Global System for Mobile communications,GSM)或码分多址接入(Code Division Multiple Access,CDMA)中的网络设备(Base Transceiver Station,BTS),也可以是宽带码分多址接入(Wide-band Code Division Multiple Access,WCDMA)中的网络设备(NodeB),还可以是长期演进(long term evolution,LTE)系统中的演进型网络设备(evolutional Node B,eNB或e-NodeB)、5G网络架构(next generation system)中的5G基站(gNB),也可以是家庭演进基站(Home evolved Node B,HeNB)、中继节点(relay node)、家庭基站(femto)、微微基站(pico)等,本公开实施例中并不限定。在一些网络结构中,网络设备可以包括集中单元(centralized unit,CU)节点和分布单元(distributed unit,DU)节点,集中单元和分布单元也可以地理上分开布置。
网络设备与终端设备之间可以各自使用一或多根天线进行多输入多输出(Multi Input Multi Output,MIMO)传输,MIMO传输可以是单用户MIMO(Single User MIMO,SU-MIMO)或多用户MIMO(Multiple User MIMO,MU-MIMO)。根据天线组合的形态和数量,MIMO传输可以是2D-MIMO、3D-MIMO、FD-MIMO或massive-MIMO,也可以是分集传输或预编码传输或波束赋形传输等。
为更清楚地理解本方案,先分别对本方案涉及的AI/ML(人工智能/机器学习)实体、管理服务以及现有技术存在的问题进行简单描述,具体如下:
(一)AI/ML实体:是指任何属于AI/ML模型或包含AI/ML模型并可作为单一复合实体进行管理的实体。
(二)管理服务(Management Service,简称MS):电信网络标准的管理一直采用NM(Network Management,网络管理)-EM(Element Management,网元管理)组合的管理架构模式,其中EM实现单设备厂商的管理功能,NM实现运营商层面的多厂商管理功能。NM和EM之间通过标准化北向接口Itf-N进行互联互通,达到NM能够对多厂商的网络进行管理和监控的目的。3GPP在5G管理功能提出了服务化管理的架构,能够支持对虚拟化资源,单网元功能特性,子网络功能和切片子网,以及面向租户的切片等多种对象进行管理的能力。服务化架构由管理功能和管理服务组合而成。管理服务的载体是管理功能,一个管理功能可以提供多个管理服务。
其中,区别于以往的定义两两功能模块之间标准化接口的固定的标准化架构,服务化架构在标准上将定义管理服务以及访问管理服务的接口,同时定义对管理服务可能的消费者和管理服务的生产者。管理服务有三个相关组成部件:服务管理接口,服务管理对象模型和服务管理数据。
(三)技术问题
随着5G网络更加动态的需求变化和多样化业务的支持,网络越来越复杂,服务质量要求和运维成本越来越高,亟需通过将移动通信技术与包括人工智能在内的自动化/智能化技术相结合,以提升移动通信网络的智能化水平。
AI/ML模型正在越来越多的领域用于5G,以实现管理能力和/或编排能力。3GPP规范工作已经研究并正在处理eMDA(管理数据分析)工作中的AI/ML模型训练。由于通信系统的复杂性,对于某些预测建模问题,问题本身的结构可能建议使用多个模型。但是目前的模型训练和测试仅考虑单一模型预测和分析的场景,并没有考虑结合多个贡献模型的预测和分析的情况。
现有标准在管理服务生产者的第一设备中进行模型训练时,配置的参数如表1所示,其中,这些配置的参数来自于管理服务消费者的第二设备的训练请求,用于指示训练单一模型。
表1

其中,表1中的M表示必须配置的,O表示可选的,T表示是,F表示否。此外,在表1中模型标识只包括一个,即目前的标准中只能训练单一模型,但是训练的单一模型会导致服务化管理能力差的问题。进一步地,在表1中每个训练请求可以指示“预期运行时间上下文”,该属性“预期运行时间上下文”描述了AI/ML实体应该为其进行训练的特定条件。虽然3GPP定义了一些可能的特定条件,但是标准在此处并不完善,也无法实现多模型联合训练和测试的场景。
具体为,对于给定的用例,不同的AI/ML实体在AI/ML推理功能中应用各自的ML(机器学习)模型,以满足不同的推理需求和功能。然而,在某些情况下,一个AI/ML启用的功能,如RAN(radio access network,无线接入网)侧推理功能可能包含多个AI/ML实体,其中一个AI/ML实体只能实现某一特定功能,一个AI/ML实体的分析输出可能作为下一个AI/ML实体的输入。例如,可以训练和创建一个有序的模型链,如回归或者分类模型,第一个模型的预测输出了第一个输出目标值,这个输出目标值可作为模型链中第二个模型的输入的一部分,用来预测第二个模型的第二个输出目标值,以此类推。还例如,Inter-gNB(下一代基站)波束选择优化模型(为第一个模型)的输出可作为切换优化分析模型(第二个模型)的输入,切换优化分析模型的输出作为最终的输出。此外,还会有利用多个模型的输出结果进行投票亦或求取均值作为最终的输出的情况。
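上述模型链的前后级联关系可以用如下最小示意说明。此处用简单的线性函数代替真实的回归/分类模型,第一个模型的输出目标值作为第二个模型输入的一部分,仅为演示链式推断关系,并非任何标准中定义的实现:

```python
# 模型链(model chain)的最小示意:第一个模型的预测输出
# 作为第二个模型输入的一部分,逐级完成推断。
# 说明:此处的“模型”用简单线性函数代替,仅演示链式关系。

def model_a(x):
    """第一个模型:根据输入特征 x 预测第一个输出目标值。"""
    return 2 * x + 1

def model_b(x, a_out):
    """第二个模型:原始特征与模型 A 的输出共同作为输入。"""
    return 0.5 * x + 3 * a_out

def chain_inference(x):
    """按模型链顺序执行推断,返回最终输出。"""
    intermediate = model_a(x)        # 第一个模型的输出目标值
    return model_b(x, intermediate)  # 作为第二个模型输入的一部分

print(chain_inference(2.0))  # model_a(2)=5,model_b(2,5)=16.0
```

同理,Inter-gNB 波束选择优化模型与切换优化分析模型的级联,也可视为 model_a 与 model_b 的一个实例化。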
但是现有标准只是针对单个AI/ML实体的训练,测试和验证,当一个模型推断功能需要多个模型联合完成对应推断任务,现有标准中的模型训练和模型测试方法不能达到更佳的模型性能以及泛化性能,进而影响3GPP中设备的管理能力。
为解决上述问题,本公开实施例提供了一种模型训练方法、模型测试方法、装置及存储介质,该方法中,通过在训练属性中增加训练指示信息,能够实现指示多个模型的联合训练,得到融合模型,进而能够提高3GPP中设备的管理能力。
其中,方法和装置是基于同一申请构思的,由于方法和装置解决问题的原理相似,因此装置和方法的实施可以相互参见,重复之处不再赘述。
参考图1,图1为本公开实施例提供的应用场景示意图。如图1所示,包括管理服务生产者的第一设备11和管理服务消费者的第二设备12,其中,AI/ML实体(模型)在训练时在第一设备11中进行,第一设备11接收第二设备12发送的训练请求,向第二设备12返回响应后,基于训练请求进行融合模型的训练,训练结束后,将训练结果发送给第二设备12。
参考图2,图2为本公开一实施例提供的模型训练方法的流程示意图,该方法应用于管理服务生产者的第一设备,具体为第一设备包括耦合到存储器的处理电路,该处理电路配置为执行以下步骤,以实现MDA(Management Data Analytics,管理数据分析)能力。如图2所示,本实施例的模型训练方法可以包括:
S201、接收管理服务消费者的第二设备的训练请求。
其中,训练请求携带有训练属性,训练属性包括:训练指示信息,训练指示信息用于指示多个模型之间的关联关系以及模型生成策略。
一种实施例中,多个模型标识是包含在训练属性中,具体地,参照表1和表2,本公开实施例主要是将表1中的模型标识改进为多个模型标识(至少两个模型标识),并增加训练指示信息,以指示第一设备对多个模型进行联合训练以训练得到一融合模型。该训练指示信息用于增强融合模型的训练效果。表2如下:
表2

其中,表2中与表1中相同的属性名称,在3GPP标准中有对应的解释,在此不再赘述。此外,表2中的CO(Conditional Optional)表示某些条件下是可选的,若只训练单个AI/ML实体,则可以配置一个模型标识,对应的训练指示信息部分可以不进行配置。若训练融合模型,可配置多个模型标识,并配置对应的训练指示信息。
一种实施例中,多个模型标识是第一设备生成的。
进一步地,训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
一种可选实施例中,训练指示信息还可以包括融合标识,即配置训练得到的融合模型的标识。在本公开实施例中模型标识也可以称为实体标识,每个实体为一个模型。每个AI/ML实体包含一个模型,对应有模型标识,训练得到的融合模型包括多个AI/ML实体,具有融合标识。
在本公开实施例中,多个模型之间的关联信息用于指示多个模型训练的前后顺序,以及每个模型对应使用的算法。
其中,多模型融合算法具体为一算法列表,包括多个算法,如Bagging(引导聚集算法),Stacking(堆叠算法),Adaboost(迭代算法),随机森林等。
其中,多个模型之间的关联信息包括模型标识列表与算法列表的对应关系,例如,模型标识A对应Bagging,模型标识B对应Stacking,模型标识C对应Adaboost,模型标识D对应随机森林。
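上述模型标识列表与算法列表的对应关系,可以用如下示意性数据结构表达。注意:其中的字段名(如 model_ids、algorithm_map 等)为本示例的假设,并非 3GPP 标准中定义的属性名:

```python
# 训练指示信息的一个示意性数据结构(字段名为本示例假设,非标准定义)
from dataclasses import dataclass, field

@dataclass
class TrainingIndication:
    train_fusion_model: bool                           # 是否训练融合模型
    model_ids: list = field(default_factory=list)      # 多个模型标识
    algorithm_map: dict = field(default_factory=dict)  # 模型标识 -> 融合算法
    generation_policy: dict = field(default_factory=dict)  # 融合模型生成策略
    ensemble_method: str = "weighted_average"          # 融合模型集成方式

req = TrainingIndication(
    train_fusion_model=True,
    model_ids=["A", "B", "C", "D"],
    algorithm_map={"A": "Bagging", "B": "Stacking",
                   "C": "Adaboost", "D": "RandomForest"},
)
print(req.algorithm_map["A"])  # Bagging
```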
进一步地,若训练指示信息包括融合模型生成策略,融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
其中,模型初始化方式包括:每个模型在每次训练时不同的初始化方法。例如,联合训练三个模型:模型A、模型B和模型C。针对模型A采用三组训练数据进行训练,采用第一组训练数据训练模型A时采用第一种初始化方法,采用第二组训练数据训练模型A时采用第二种初始化方法,采用第三组训练数据训练模型A时采用第三种初始化方法。其中初始化是采用初始化函数对模型参数进行初始化。模型训练轮数是指每组数据训练的轮数,如针对模型A,采用第一组训练数据训练10轮,采用第二组训练数据训练10轮,采用第三组训练数据训练10轮。模型对应的损失函数是指训练过程中每个模型使用的损失函数,如模型A使用损失函数a、模型B使用损失函数b、模型C使用损失函数c。网络结构例如网络层数。
进一步地,若训练指示信息包括融合模型集成方式,融合模型集成方式包括:加权或直接平均法、投票法以及学习法中的至少一项。其中,加权平均法是指将每个模型的输出进行加权平均计算作为融合模型的输出。例如,训练得到的融合模型包括模型A、模型B和模型C。将第一训练数据分别输入模型A、模型B和模型C,对应分别得到预测结果x、预测结果y和预测结果z。对预测结果x、预测结果y和预测结果z进行加权平均得到的结果为融合模型最后的输出结果。直接平均法是求预测结果x、预测结果y与预测结果z的和,然后除以3,作为融合模型最后的输出结果。投票法是指在预测结果x、预测结果y和预测结果z中选取最大值、最小值或中间值作为融合模型最后的输出结果。学习法是指将预测结果x、预测结果y与预测结果z输入另一个学习模型D中,将D输出的结果作为融合模型的最后输出。在本公开实施例中,以上只是示例说明,本公开不限定融合模型的集成方式。
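上述几种集成方式可以用如下最小示意说明。其中的“学习模型D”用一个固定系数的加权函数代替,仅作演示,不代表任何实际训练得到的模型:

```python
# 融合模型几种集成方式(加权平均、直接平均、投票、学习法)的最小示意。
# preds 中的三个值对应文中三个模型对同一输入的预测结果 x、y、z。

def weighted_average(preds, weights):
    """加权平均法:各模型输出按权重加权平均。"""
    return sum(p * w for p, w in zip(preds, weights)) / sum(weights)

def direct_average(preds):
    """直接平均法:各模型输出求和后除以模型个数。"""
    return sum(preds) / len(preds)

def vote_max(preds):
    """投票法的一种:取最大值(也可取最小值或中间值)。"""
    return max(preds)

def learner_d(preds):
    """学习法:将各模型输出再输入另一个学习模型 D,
    此处以固定系数的线性组合示意。"""
    coeffs = [0.2, 0.3, 0.5]
    return sum(c * p for c, p in zip(coeffs, preds))

preds = [1.0, 2.0, 3.0]
print(direct_average(preds))               # 2.0
print(weighted_average(preds, [1, 1, 2]))  # (1+2+6)/4 = 2.25
print(vote_max(preds))                     # 3.0
```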
在本公开实施例中,以上训练属性均为第二设备发送给第一设备的,第二设备如终端设备,人员可以使用终端设备向第一设备发送训练请求,以指示第一设备对融合模型的训练。
S202、在第一设备中配置训练属性,并根据训练指示信息和多个模型标识,对多个模型进行联合训练,得到训练结果。
具体为,在第一设备的服务管理接口中配置训练属性。本公开中的模型训练是在第一设备中进行,可以由网元管理(如管理基站的OMC-R)或更高一级的网络管理提供该服务,也可以将管理服务和网元合设在一起,比如RAN侧的AI/ML模型的训练需要OAM(Operation Administration and Maintenance,操作维护管理)提供训练服务支持,OAM可以实现管理功能,可以设置在基站里也可以在基站外部。
在本公开实施例中,管理服务可以从被管理网络和服务的网络功能获得被管理网络和服务的原始网络数据,进而进行融合模型的训练。其中,可以是一个或多个第二设备向第一设备发送训练请求,在训练请求中需要携带上述训练属性。然后第一设备根据自身的资源以及训练属性的分析,确定是否满足训练属性的训练要求,若可以进行融合模型的训练,则响应第二设备的训练请求为可以进行训练,若不可以进行融合模型的训练,则响应第二设备的训练请求为无法进行训练。
进一步地,若第一设备可以对融合模型进行训练,则在第一设备对应的管理服务的接口中配置训练属性,以进行融合模型的训练。首先,通过训练属性中配置的训练数据源,获取数据样本,将数据样本分为训练集和测试集,然后采用训练集进行融合模型的训练,采用测试集进行训练完成的融合模型的测试。此外,若训练属性中指示的是融合模型的训练,可选择对应的多模型融合算法、模型数据源、融合模型生成策略、融合模型集成方式,开始融合模型的训练。
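上述“获取数据样本、划分训练集与测试集、联合训练并集成”的流程,可以用如下简化示意说明。其中的“模型训练”以估计一维线性斜率为例,仅演示流程,不代表实际使用的算法:

```python
# 第一设备进行融合模型训练流程的简化示意:
# 获取数据样本 -> 划分训练集/测试集 -> 分别训练各模型 -> 按集成方式组合。
import random

random.seed(0)
# 数据样本:(特征, 标签),构造为 y = 2x,便于验证
samples = [(x, 2.0 * x) for x in range(1, 11)]
random.shuffle(samples)
train_set, test_set = samples[:8], samples[8:]  # 划分训练集和测试集

def train_slope(data):
    """单个“模型”的训练:估计 y ≈ k*x 中的斜率 k。"""
    return sum(y / x for x, y in data) / len(data)

# 联合训练多个模型(此处用训练集的不同子集模拟多组训练数据)
model_slopes = [train_slope(train_set[i::2]) for i in range(2)]

def fusion_predict(x):
    """融合模型输出:对各模型的预测结果取直接平均。"""
    preds = [k * x for k in model_slopes]
    return sum(preds) / len(preds)

# 采用测试集对训练完成的融合模型进行测试
errors = [abs(fusion_predict(x) - y) for x, y in test_set]
print(max(errors))  # 对该构造数据,误差为 0
```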
其中,训练结果包括训练完成的融合模型。此外,训练结果还包括融合模型的训练报告,训练报告中包括融合模型的性能,如置信度,用于表示AI/ML模型对与训练数据分布相同的数据进行推理时的置信度。
在本公开实施例中,在分布式训练模式下,即存在多个训练请求时,可以通过融合标识索引到对应多模型联合训练的所有相关训练过程。
进一步地,可以将训练报告发送给第二设备,以便于第二设备获知融合模型的训练结果。其中,训练报告包括的内容参照表3,其中,CM(Conditional Mandatory)表示某些条件下必选的。
表3
进一步地,在本公开实施例中,训练好的融合模型可以配置在网元设备中,如基站、交换机、路由器以实现对应网元设备的管理能力。
在本公开实施例中,可以从被管理网络和服务的网络功能(NF)获得被管理网络和服务的网络数据,可以将该网络数据作为训练数据,根据训练融合模型的训练指示信息,选取不同的多模型融合算法和策略进行融合模型的联合训练,得到融合模型,其中,融合模型能够提高相应设备的服务能力。
参考图3,图3为本公开另一实施例提供的模型训练方法的流程示意图,该方法应用于管理服务消费者的第二设备,具体包括以下步骤:
S301,生成训练请求。
其中,训练请求携带有训练属性,训练属性包括:训练指示信息,训练指示信息用于指示多个模型标识对应的多个模型训练时,多个模型之间的关联关系以及模型生成策略,多个模型标识是包含在训练属性中的或者多个模型标识是第一设备生成的。
S302,向管理服务生产者的第一设备发送训练请求。
其中,向管理服务生产者的第一设备发送训练请求,可以指示第一设备基于该训练请求进行融合模型的训练。
在本公开实施例中,应用于管理服务消费者的第二设备的模型训练方法的实现原理和技术效果可参照前述实施例,在此不再赘述。
此外,需要对训练完成的融合模型进行测试,参照图4,包括管理服务生产者的第三设备41和管理服务消费者的第四设备42,其中,AI/ML实体(模型)在测试时在第三设备41中进行,第三设备41接收第四设备42发送的测试请求,向第四设备42返回响应后,基于测试请求进行融合模型的测试,测试结束后,将测试结果发送给第四设备42。在本公开实施例中,第三设备是和第一设备不同的网络设备,第一设备是用于融合模型训练的设备,第三设备是可以使用融合模型的设备,第四设备和第二设备可以相同也可以不同。
具体地,参照图5为本公开一实施例提供的模型测试方法的流程示意图,该方法应用于管理服务生产者的第三设备,具体为第三设备包括耦合到存储器的处理电路,该处理电路配置为执行以下步骤,以实现MDA(Management Data Analytics,管理数据分析)能力。如图5所示,本实施例的模型测试方法可以包括:
S501、接收管理服务消费者的第四设备的测试请求。
其中,测试请求携带有测试属性,测试属性包括:测试数据和测试指示信息,测试指示信息用于指示多个模型之间的关联关系以及模型生成策略。
具体地,测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
其中,是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式的具体内容参照上述训练过程的实施例,在此不再赘述。
在本公开实施例中,测试数据也可以是第三设备从数据源中获取的测试数据集,将第四设备发送的与该测试数据集具有相同特征分布的数据作为融合模型的测试数据。
S502,根据测试请求,获取请求测试的融合模型。
其中,测试请求中包括:融合模型的融合标识,进而可以根据测试请求确定待测试的融合模型,该融合模型可以是采用上述模型训练方法训练得到的。
此外,第三设备根据自身的资源以及测试属性的分析,确定是否满足测试属性的测试要求,若可以进行融合模型的测试,则响应第四设备的测试请求为可以进行测试,若不可以进行融合模型的测试,则响应第四设备的测试请求为无法进行测试。
S503,基于测试指示信息,采用测试数据对融合模型进行测试,得到测试结果。
其中,在测试前,本公开将测试属性配置在第三设备的管理服务接口,以便于对融合模型进行测试。
其中,测试指示信息指示了多个模型的多模型融合算法、融合模型生成策略、融合模型集成方式等,因此可以将测试数据输入融合模型,使融合模型按照测试指示信息对测试数据进行处理,得到测试结果。
具体地,若第三设备可以对融合模型进行测试,则选择相关的测试指示信息,将测试数据输入融合模型,得到融合模型输出的预测结果,采用预设的损失函数计算该预测结果和真值结果的损失值,进而可以确定融合模型的性能。
进一步地,在完成融合模型的测试后,可以生成测试结果,测试结果包括融合模型的性能,例如,召回率、分类模型的接受者工作特征曲线(ROC-AUC)或置信度。
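测试结果中性能指标的计算可以用如下最小示意说明。此处以二分类的召回率为例(ROC-AUC、置信度等指标的计算思路类似,从略),其中的真值与预测序列均为假设的演示数据:

```python
# 测试结果中性能指标计算的简化示意:二分类的召回率。

def recall(y_true, y_pred):
    """召回率 = 被正确预测为正类的样本数 / 实际正类样本数。"""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

# 假设融合模型在某组测试数据上的真值与预测结果如下(演示数据)
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(recall(y_true, y_pred))  # 3/4 = 0.75
```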
S504,向第四设备发送测试结果。
在本公开实施例中,根据融合算法和策略对集成模型进行测试得到测试结果,通过性能指标以及置信度等评估方法得到模型的评估指标。
进一步地,本公开可以将测试结果表示测试合格的融合模型上线至对应的网元设备进行使用。
综上,本公开能够实现在通信系统中,基于多模型联合训练融合模型,以及多模型联合测试融合模型,进而采用融合模型处理网络数据,可以显著提高模型的预测和泛化能力,从而能在网络智能化和网络自治系统中获得更好的网络性能。
参考图6,图6为本公开另一实施例提供的模型测试方法的流程示意图,该方法应用于管理服务消费者的第四设备。如图6所示,本实施例的模型测试方法可以包括:
S601、生成测试请求。
其中,测试请求携带有测试属性,测试属性包括:测试数据和测试指示信息,测试指示信息用于指示多个模型之间的关联关系以及模型生成策略。
其中,测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
S602、向管理服务生产者的第三设备发送测试请求。
其中,测试请求用于指示第三设备获取请求测试的融合模型。
S603、接收第三设备发送的测试结果。
其中,测试结果包括融合模型的性能,测试结果是第三设备基于测试指示信息,采用测试数据对融合模型进行测试得到的。
其中,S601至S603的实现原理和技术效果可参照前述实施例,不再赘述。
本公开实施例提供了一种模型训练装置,本实施例的模型训练装置可以为第一设备。如图7所示,模型训练装置可以包括收发机701、处理器702和存储器703。
收发机701,用于在处理器702的控制下接收和发送数据。
其中,在图7中,总线架构可以包括任意数量的互联的总线和桥,具体由处理器702代表的一个或多个处理器和存储器703代表的存储器的各种电路链接在一起。总线架构还可以将诸如外围设备、稳压器和功率管理电路等之类的各种其他电路链接在一起,这些都是本领域所公知的,因此,本文不再对其进行进一步描述。总线接口提供接口。收发机701可以是多个元件,即包括发送机和接收机,提供用于在传输介质上与各种其他装置通信的单元,这些传输介质包括无线信道、有线信道、光缆等传输介质。可选的,模型训练装置还可以包括用户接口704,针对不同的用户设备,用户接口704还可以是能够外接、内接所需设备的接口,连接的设备包括但不限于小键盘、显示器、扬声器、麦克风、操纵杆等。
处理器702负责管理总线架构和通常的处理,存储器703可以存储处理器702在执行操作时所使用的数据。
可选的,处理器702可以是中央处理器(central processing unit,CPU)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或复杂可编程逻辑器件(Complex Programmable Logic Device,CPLD),处理器702也可以采用多核架构。
处理器702通过调用存储器703存储的计算机程序,用于按照获得的可执行指令执行本公开实施例提供的有关第一设备的任一方法。处理器与存储器也可以物理上分开布置。
具体的,处理器702用于执行如下操作:接收管理服务消费者的第二设备的训练请求,训练请求携带有训练属性,训练属性包括:训练指示信息,训练指示信息用于指示多个模型之间的关联关系以及模型生成策略;在第一设备中配置训练属性,并根据训练指示信息和多个模型标识,对多个模型进行联合训练,得到训练结果,多个模型标识是包含在训练属性中的或者多个模型标识是第一设备生成的,训练结果包括训练完成的融合模型。
可选的,训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
可选的,若训练指示信息包括融合模型生成策略,融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
可选的,若训练指示信息包括融合模型集成方式,融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
在此需要说明的是,本公开提供的上述装置,能够实现上述方法实施例中第一设备所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
在网络侧,本公开实施例提供了一种模型测试装置,本实施例的模型测试装置可以为第三设备。如图8所示,模型测试装置可以包括收发机801、处理器802和存储器803。
收发机801,用于在处理器802的控制下接收和发送数据。
其中,在图8中,总线架构可以包括任意数量的互联的总线和桥,具体由处理器802代表的一个或多个处理器和存储器803代表的存储器的各种电路链接在一起。总线架构还可以将诸如外围设备、稳压器和功率管理电路等之类的各种其他电路链接在一起,这些都是本领域所公知的,因此,本文不再对其进行进一步描述。总线接口提供接口。收发机801可以是多个元件,即包括发送机和接收机,提供用于在传输介质上与各种其他装置通信的单元,这些传输介质包括无线信道、有线信道、光缆等传输介质。处理器802负责管理总线架构和通常的处理,存储器803可以存储处理器802在执行操作时所使用的数据。
处理器802可以是CPU、ASIC、FPGA或CPLD,处理器也可以采用多核架构。
处理器802通过调用存储器803存储的计算机程序,用于按照获得的可执行指令执行本公开实施例提供的有关网络设备的任一方法。处理器与存储器也可以物理上分开布置。
具体的,处理器802用于执行如下操作:接收管理服务消费者的第四设备的测试请求,测试请求携带有测试属性,测试属性包括:测试数据和测试指示信息,测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;根据测试请求,获取请求测试的融合模型;基于测试指示信息,采用测试数据对融合模型进行测试,得到测试结果,测试结果包括融合模型的性能;向第四设备发送测试结果。
可选的,测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
在此需要说明的是,本公开提供的上述装置,能够实现上述方法实施例中第三设备所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
本公开实施例提供了一种模型训练装置,本实施例的模型训练装置可以为第二设备。如图9所示,模型训练装置可以包括:生成单元901和发送单元902。
生成单元901,用于生成训练请求,训练请求携带有训练属性,训练属性包括:训练指示信息,训练指示信息用于指示多个模型标识对应的多个模型训练时,多个模型之间的关联关系以及模型生成策略,多个模型标识是包含在训练属性中的或者多个模型标识是第一设备生成的;
发送单元902,用于向管理服务生产者的第一设备发送训练请求。
可选的,训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
可选的,若训练指示信息包括融合模型生成策略,融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
可选的,若训练指示信息包括融合模型集成方式,融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
在此需要说明的是,本公开提供的上述装置,能够实现上述方法实施例中第二设备所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
本公开实施例还提供了一种模型测试装置,本实施例的模型测试装置可以为第四设备。如图10所示,模型测试装置包括:生成单元1001、发送单元1002和接收单元1003。
生成单元1001,用于生成测试请求,测试请求携带有测试属性,测试属性包括:测试数据和测试指示信息,测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;
发送单元1002,用于向管理服务生产者的第三设备发送测试请求,测试请求用于指示第三设备获取请求测试的融合模型;
接收单元1003,用于接收第三设备发送的测试结果,测试结果包括融合模型的性能,测试结果是第三设备基于测试指示信息,采用测试数据对融合模型进行测试得到的。
可选的,测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
在此需要说明的是,本公开提供的上述装置,能够实现上述方法实施例中第四设备所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
需要说明的是,本公开实施例中对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(processor)执行本公开各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
本公开实施例提供了一种处理器可读存储介质,处理器可读存储介质存储有计算机程序,计算机程序用于使处理器执行本公开实施例提供的有关第一设备至第四设备的任一方法。使处理器能够实现上述方法实施例中第一设备至第四设备所实现的所有方法步骤,且能够达到相同的技术效果,在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。
处理器可读存储介质可以是处理器能够存取的任何可用介质或数据存储设备,包括但不限于磁性存储器(例如软盘、硬盘、磁带、磁光盘(MO)等)、光学存储器(例如CD、DVD、BD、HVD等)、以及半导体存储器(例如ROM、EPROM、EEPROM、非易失性存储器(NAND FLASH)、固态硬盘(SSD))等。
本领域内的技术人员应明白,本公开的实施例可提供为方法、系统、或计算机程序产品。因此,本公开可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本公开可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。
本公开是参照根据本公开实施例的方法、装置、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机可执行指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机可执行指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些处理器可执行指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的处理器可读存储器中,使得存储在该处理器可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些处理器可执行指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
显然,本领域的技术人员可以对本公开进行各种改动和变型而不脱离本公开的精神和范围。这样,倘若本公开的这些修改和变型属于本公开权利要求及其等同技术的范围之内,则本公开也意图包含这些改动和变型在内。

Claims (25)

  1. 一种模型训练方法,其特征在于,应用于管理服务生产者的第一设备,所述模型训练方法包括:
    接收管理服务消费者的第二设备的训练请求,所述训练请求携带有训练属性,所述训练属性包括:训练指示信息,所述训练指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    在所述第一设备中配置所述训练属性,并根据所述训练指示信息和多个模型标识,对所述多个模型进行联合训练,得到训练结果,所述多个模型标识是包含在所述训练属性中的或者所述多个模型标识是所述第一设备生成的,所述训练结果包括训练完成的融合模型。
  2. 根据权利要求1所述的模型训练方法,其特征在于,所述训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  3. 根据权利要求2所述的模型训练方法,其特征在于,若所述训练指示信息包括融合模型生成策略,所述融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
  4. 根据权利要求2所述的模型训练方法,其特征在于,若所述训练指示信息包括融合模型集成方式,所述融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
  5. 一种模型测试方法,其特征在于,应用于管理服务生产者的第三设备,所述模型测试方法包括:
    接收管理服务消费者的第四设备的测试请求,所述测试请求携带有测试属性,所述测试属性包括:测试数据和测试指示信息,所述测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    根据所述测试请求,获取请求测试的融合模型;
    基于所述测试指示信息,采用所述测试数据对所述融合模型进行测试,得到测试结果,所述测试结果包括所述融合模型的性能;
    向所述第四设备发送所述测试结果。
  6. 根据权利要求5所述的模型测试方法,其特征在于,所述测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  7. 一种模型训练方法,其特征在于,应用于管理服务消费者的第二设备,包括:
    生成训练请求,所述训练请求携带有训练属性,所述训练属性包括:训练指示信息,所述训练指示信息用于指示多个模型标识对应的多个模型训练时,多个模型之间的关联关系以及模型生成策略,所述多个模型标识是包含在所述训练属性中的或者所述多个模型标识是管理服务生产者的第一设备生成的;
    向管理服务生产者的第一设备发送所述训练请求。
  8. 根据权利要求7所述的模型训练方法,其特征在于,所述训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  9. 根据权利要求7所述的模型训练方法,其特征在于,若所述训练指示信息包括融合模型生成策略,所述融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
  10. 根据权利要求7所述的模型训练方法,其特征在于,若所述训练指示信息包括融合模型集成方式,所述融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
  11. 一种模型测试方法,其特征在于,应用于管理服务消费者的第四设备,所述模型测试方法包括:
    生成测试请求,所述测试请求携带有测试属性,所述测试属性包括:测试数据和测试指示信息,所述测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    向管理服务生产者的第三设备发送测试请求,所述测试请求用于指示所述第三设备获取请求测试的融合模型;
    接收所述第三设备发送的测试结果,所述测试结果包括所述融合模型的性能,所述测试结果是所述第三设备基于所述测试指示信息,采用所述测试数据对所述融合模型进行测试得到的。
  12. 根据权利要求11所述的模型测试方法,其特征在于,所述测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  13. 一种模型训练装置,其特征在于,应用于管理服务生产者的第一设备,包括存储器、收发机和处理器:
    所述存储器,用于存储计算机程序;
    所述收发机,用于在所述处理器的控制下收发数据;
    所述处理器,用于读取所述存储器中的计算机程序并执行如下操作:
    接收管理服务消费者的第二设备的训练请求,所述训练请求携带有训练属性,所述训练属性包括:训练指示信息,所述训练指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    在所述第一设备中配置所述训练属性,并根据所述训练指示信息和多个模型标识,对所述多个模型进行联合训练,得到训练结果,所述多个模型标识是包含在所述训练属性中的或者所述多个模型标识是所述第一设备生成的,所述训练结果包括训练完成的融合模型。
  14. 根据权利要求13所述的模型训练装置,其特征在于,所述训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  15. 根据权利要求14所述的模型训练装置,其特征在于,若所述训练指示信息包括融合模型生成策略,所述融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
  16. 根据权利要求14所述的模型训练装置,其特征在于,若所述训练指示信息包括融合模型集成方式,所述融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
  17. 一种模型测试装置,其特征在于,应用于管理服务生产者的第三设备,包括存储器、收发机和处理器:
    所述存储器,用于存储计算机程序;
    所述收发机,用于在所述处理器的控制下收发数据;
    所述处理器,用于读取所述存储器中的计算机程序并执行如下操作:
    接收管理服务消费者的第四设备的测试请求,所述测试请求携带有测试属性,所述测试属性包括:测试数据和测试指示信息,所述测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    根据所述测试请求,获取请求测试的融合模型;
    基于所述测试指示信息,采用所述测试数据对所述融合模型进行测试,得到测试结果,所述测试结果包括所述融合模型的性能;
    向所述第四设备发送所述测试结果。
  18. 根据权利要求17所述的模型测试装置,其特征在于,所述测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  19. 一种模型训练装置,其特征在于,应用于管理服务消费者的第二设备,包括:
    生成单元,用于生成训练请求,所述训练请求携带有训练属性,所述训练属性包括:训练指示信息,所述训练指示信息用于指示多个模型标识对应的多个模型训练时,多个模型之间的关联关系以及模型生成策略,所述多个模型标识是包含在所述训练属性中的或者所述多个模型标识是管理服务生产者的第一设备生成的;
    发送单元,用于向第一设备发送所述训练请求。
  20. 根据权利要求19所述的模型训练装置,其特征在于,所述训练指示信息包括:用于指示是否训练融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  21. 根据权利要求20所述的模型训练装置,其特征在于,若所述训练指示信息包括融合模型生成策略,所述融合模型生成策略包括:模型初始化方式、模型训练轮数、模型对应的损失函数以及模型的网络结构中的至少一项。
  22. 根据权利要求20所述的模型训练装置,其特征在于,若所述训练指示信息包括融合模型集成方式,所述融合模型集成方式包括:加权或直接平均法、投票法以及学习法中至少一项。
  23. 一种模型测试装置,其特征在于,应用于管理服务消费者的第四设备,包括:
    生成单元,用于生成测试请求,所述测试请求携带有测试属性,所述测试属性包括:测试数据和测试指示信息,所述测试指示信息用于指示多个模型之间的关联关系以及模型生成策略;
    发送单元,用于向管理服务生产者的第三设备发送测试请求,所述测试请求用于指示所述第三设备获取请求测试的融合模型;
    接收单元,用于接收所述第三设备发送的测试结果,所述测试结果包括所述融合模型的性能,所述测试结果是所述第三设备基于所述测试指示信息,采用所述测试数据对所述融合模型进行测试得到的。
  24. 根据权利要求23所述的模型测试装置,其特征在于,所述测试指示信息包括:用于指示是否测试融合模型、多个模型之间的关联信息、多模型融合算法、融合模型生成策略、融合模型集成方式中的至少一项。
  25. 一种处理器可读存储介质,其特征在于,所述处理器可读存储介质存储有计算机程序,所述计算机程序用于使所述处理器执行权利要求1-4、7-10中任一项所述的模型训练方法或5-6、11-12中任一项所述的模型测试方法。
PCT/CN2023/120161 2022-11-04 2023-09-20 模型训练方法、模型测试方法、装置及存储介质 WO2024093561A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211378359.3A CN118036777A (zh) 2022-11-04 2022-11-04 模型训练方法、模型测试方法、装置及存储介质
CN202211378359.3 2022-11-04

Publications (1)

Publication Number Publication Date
WO2024093561A1 true WO2024093561A1 (zh) 2024-05-10

Family

ID=90929670

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/120161 WO2024093561A1 (zh) 2022-11-04 2023-09-20 模型训练方法、模型测试方法、装置及存储介质

Country Status (2)

Country Link
CN (1) CN118036777A (zh)
WO (1) WO2024093561A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469366A (zh) * 2020-03-31 2021-10-01 北京观成科技有限公司 一种加密流量的识别方法、装置及设备
CN113822322A (zh) * 2021-07-15 2021-12-21 腾讯科技(深圳)有限公司 图像处理模型训练方法及文本处理模型训练方法
CN114118192A (zh) * 2020-09-01 2022-03-01 中国移动通信有限公司研究院 用户预测模型的训练方法、预测方法、装置及存储介质
CN114912705A (zh) * 2022-06-01 2022-08-16 南京理工大学 一种联邦学习中异质模型融合的优化方法
US20220335711A1 (en) * 2021-07-29 2022-10-20 Beijing Baidu Netcom Science Technology Co., Ltd. Method for generating pre-trained model, electronic device and storage medium


Also Published As

Publication number Publication date
CN118036777A (zh) 2024-05-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23884493

Country of ref document: EP

Kind code of ref document: A1