CN115442216B

CN115442216B - Network slicing fault self-healing method, device, equipment and computer storage medium

Info

Publication number: CN115442216B
Application number: CN202110628791.2A
Authority: CN
Inventors: 邢彪; 丁东; 冯杭生; 陈嫦娇; 陈向荣
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Group Zhejiang Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Group Zhejiang Co Ltd
Priority date: 2021-06-04
Filing date: 2021-06-04
Publication date: 2023-09-05
Anticipated expiration: 2041-06-04
Also published as: CN115442216A

Abstract

The invention discloses a network slice fault self-healing method, device, equipment and computer program product. The method comprises: acquiring multi-dimensional index data of a network slice and detecting the data; when abnormal data are detected, generating a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluating both actions to determine their behavior scores; and iteratively optimizing the generated slice self-healing action and slice fault action according to the behavior scores, so as to determine a target self-healing action capable of healing the fault of the network slice and to repair the network slice with it. Because the slice self-healing action and the slice fault action are generated by two separate agents that compete against each other, the generated slice self-healing action is forced to keep improving its fault repair capability, which improves the performance of the fault self-healing strategy and, in turn, the fault self-healing capability of the network slice.

Description

Network slicing fault self-healing method, device, equipment and computer storage medium

Technical field

The present invention relates to the field of communication technology, and in particular to a network slice fault self-healing method, device, equipment and computer program product.

Background

Existing network slice fault self-healing methods rely on the experience of technicians, who set the self-healing strategy manually. Such a strategy mainly sets simple thresholds on the individual index data to trigger slice fault self-healing actions. When the strategy depends on manually set rules, fault self-healing is inefficient and error-prone, and the strategy cannot be adjusted in time when the network slice changes. Moreover, the existing slice fault self-healing procedure must first analyze the fault type and the scope of its impact, then determine a suitable fault handling procedure, and finally repair or self-heal the fault based on manually defined policies. This procedure is long, and much of the time is spent on fault analysis and localization, which prolongs the service impact and degrades the user experience. Existing network slice fault self-healing strategies that rely on manually set rules therefore have poor fault self-healing performance.

Summary of the invention

The main purpose of the present invention is to provide a network slice fault self-healing method, device, equipment and computer program product, aiming to solve the technical problem that existing network slice fault self-healing strategies, which rely on manually set rules, have poor fault self-healing performance.

To achieve the above purpose, the present invention provides a network slice fault self-healing method, which includes the following steps:

acquiring multi-dimensional index data of a network slice, and detecting the acquired multi-dimensional index data;

when abnormal index data is detected in the multi-dimensional index data of the network slice, generating a slice self-healing action and a slice fault action according to the abnormal index data;

evaluating the generated slice self-healing action and slice fault action to determine behavior scores of the slice self-healing action and the slice fault action;

iteratively optimizing the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and using the target self-healing action to repair the fault of the network slice.

Optionally, the step of generating a slice self-healing action and a slice fault action according to the multi-dimensional index data includes:

inputting the multi-dimensional index data into a preset target self-healing model, wherein the target self-healing model is obtained by iteratively training a preset self-healing model to be trained with historical multi-dimensional index data of network slices, and the target self-healing model includes an action generator;

generating the slice self-healing action and the slice fault action with the action generator of the target self-healing model according to the multi-dimensional index data.

Optionally, the action generator includes a self-healing action generator and a fault action generator, and the step of generating the slice self-healing action and the slice fault action with the action generator of the target self-healing model according to the multi-dimensional index data includes:

determining the current full-information state of the network slice according to the multi-dimensional index data, wherein the full-information state includes a first state observable by the self-healing action generator and a second state observable by the fault action generator;

inputting the first state of the full-information state into the self-healing action generator, and generating the slice self-healing action with the self-healing action generator;

inputting the second state of the full-information state into the fault action generator, and generating the slice fault action with the fault action generator.
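The split of the full-information state into two observable views can be illustrated with a minimal Python sketch. It is not the patented implementation: the index-based split (`healer_idx`, `fault_idx`), the `FullState` container and the callable generators are all assumptions made for illustration, since the text above does not say how the two states are derived from the index data.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class FullState:
    """Full-information state holding the two agent-specific views."""
    first_state: Vector   # observable by the self-healing action generator
    second_state: Vector  # observable by the fault action generator


def build_full_state(kpis: Sequence[float],
                     healer_idx: Sequence[int],
                     fault_idx: Sequence[int]) -> FullState:
    # Assumption: each agent observes a fixed subset of the KPI vector.
    return FullState(
        first_state=tuple(kpis[i] for i in healer_idx),
        second_state=tuple(kpis[i] for i in fault_idx),
    )


def generate_actions(state: FullState,
                     heal_generator: Callable[[Vector], Vector],
                     fault_generator: Callable[[Vector], Vector]
                     ) -> Tuple[Vector, Vector]:
    # Each generator receives only its own view of the state.
    heal_action = heal_generator(state.first_state)
    fault_action = fault_generator(state.second_state)
    return heal_action, fault_action
```

In a real system the two generators would be trained neural networks; any callable that maps a state vector to an action vector fits this sketch.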

Optionally, the target self-healing model further includes a self-healing action evaluator and a fault action evaluator, and the step of evaluating the generated slice self-healing action and slice fault action to determine the behavior scores of the slice self-healing action and the slice fault action includes:

inputting the first state of the full-information state, the slice self-healing action and the slice fault action into the self-healing action evaluator of the target self-healing model, so as to evaluate the slice self-healing action and determine its behavior score;

determining a fault damage radius according to the slice fault action;

inputting the second state of the full-information state, the fault damage radius, the slice self-healing action and the slice fault action into the fault action evaluator of the target self-healing model, so as to evaluate the slice fault action and determine its behavior score.
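The inputs of the two evaluators listed above can be assembled as in the following sketch. Simple concatenation is an assumed encoding, and `fault_damage_radius` is a purely hypothetical stand-in: the text above only states that the radius is determined from the slice fault action, not how.

```python
from typing import List, Sequence


def fault_damage_radius(fault_action: Sequence[float]) -> float:
    # Hypothetical stand-in: use the largest action magnitude as the radius.
    return max((abs(x) for x in fault_action), default=0.0)


def healer_evaluator_input(first_state: Sequence[float],
                           heal_action: Sequence[float],
                           fault_action: Sequence[float]) -> List[float]:
    # The self-healing evaluator sees the first state and both actions.
    return [*first_state, *heal_action, *fault_action]


def fault_evaluator_input(second_state: Sequence[float],
                          radius: float,
                          heal_action: Sequence[float],
                          fault_action: Sequence[float]) -> List[float]:
    # The fault evaluator additionally receives the fault damage radius.
    return [*second_state, radius, *heal_action, *fault_action]
```

Both evaluators see both actions, so each score reflects how an action performs against the opposing agent's current behavior.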

Optionally, the step of iteratively optimizing the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and using the target self-healing action to repair the fault of the network slice, includes:

feeding the behavior score of the slice self-healing action back to the self-healing action generator, feeding the behavior score of the slice fault action back to the fault action generator, and returning to the step of generating the slice self-healing action with the self-healing action generator of the target self-healing model and generating the slice fault action with the fault action generator of the target self-healing model, so as to iteratively optimize the slice self-healing action and the slice fault action until the behavior score of the slice self-healing action and the behavior score of the slice fault action satisfy a preset condition, thereby obtaining the target self-healing action;

switching the state of the network slice according to the target self-healing action, so that the network slice moves from its current full-information state to the target full-information state corresponding to the target self-healing action, thereby repairing the fault of the network slice.
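The feedback loop described above (generate, score, feed the scores back, repeat until a preset condition holds) can be sketched as follows. The `ActionGenerator` interface, the scalar actions, the `is_done` predicate and the toy agents `StepHealer` and `FixedAttacker` are all illustrative assumptions, not the patent's neural-network generators and evaluators.

```python
from typing import Callable, Protocol, Sequence

State = Sequence[float]
Evaluator = Callable[[State, float, float], float]


class ActionGenerator(Protocol):
    def act(self, state: State) -> float: ...
    def update(self, score: float) -> None: ...


def optimize_actions(healer: ActionGenerator,
                     attacker: ActionGenerator,
                     healer_evaluator: Evaluator,
                     attacker_evaluator: Evaluator,
                     first_state: State,
                     second_state: State,
                     is_done: Callable[[float, float], bool],
                     max_iters: int = 100) -> float:
    """Iterate both agents until the scores satisfy the preset condition."""
    heal_action = healer.act(first_state)
    for _ in range(max_iters):
        fault_action = attacker.act(second_state)
        heal_score = healer_evaluator(first_state, heal_action, fault_action)
        fault_score = attacker_evaluator(second_state, heal_action, fault_action)
        if is_done(heal_score, fault_score):
            break
        # Feed each behavior score back to the generator it evaluates.
        healer.update(heal_score)
        attacker.update(fault_score)
        heal_action = healer.act(first_state)
    return heal_action


class StepHealer:
    """Toy healer: raises its repair strength while its score is negative."""
    def __init__(self) -> None:
        self.strength = 0.0

    def act(self, state: State) -> float:
        return self.strength

    def update(self, score: float) -> None:
        if score < 0:
            self.strength += 1.0


class FixedAttacker:
    """Toy attacker: always injects a fault of the same severity."""
    def act(self, state: State) -> float:
        return 3.0

    def update(self, score: float) -> None:
        pass  # a learning attacker would adapt here


target_action = optimize_actions(
    StepHealer(), FixedAttacker(),
    healer_evaluator=lambda s, heal, fault: heal - fault,
    attacker_evaluator=lambda s, heal, fault: fault - heal,
    first_state=(), second_state=(),
    is_done=lambda heal_score, fault_score: heal_score >= 0,
)
```

With these toy agents the healer raises its strength by one per round until it matches the fixed fault severity, at which point the stop condition holds and the loop returns the target action.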

Optionally, after the step of iteratively optimizing the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and using the target self-healing action to repair the fault of the network slice, the method further includes:

inputting the target self-healing action into a preset reward function to obtain a reward value corresponding to the target self-healing action;

generating an experience replay data set according to the reward value, and updating model parameters of the target self-healing model according to the experience replay data set.
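An experience replay data set is commonly kept as a bounded buffer of transitions from which minibatches are drawn for parameter updates. The sketch below assumes that layout; the `Transition` fields, the buffer capacity and the uniform sampling are assumptions, and the reward value is taken as an input because the preset reward function is not specified here.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Tuple


class Transition(NamedTuple):
    state: Tuple[float, ...]
    action: Tuple[float, ...]
    reward: float  # value returned by the preset reward function
    next_state: Tuple[float, ...]


class ReplayBuffer:
    """Bounded experience store; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)
```

Each sampled batch would then drive one update of the model parameters; the update rule itself depends on the chosen reinforcement learning algorithm and is left out.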

Optionally, before the step of generating a slice self-healing action and a slice fault action according to the target index data, the method further includes:

acquiring historical index data of network slices and preprocessing it to obtain sample data;

acquiring model architecture parameters, and building a basic self-healing model according to the model architecture parameters;

iteratively training the basic self-healing model with the sample data to obtain the target self-healing model.

In addition, to achieve the above purpose, the present invention also provides a network slice fault self-healing device, which includes:

a data detection module, configured to acquire multi-dimensional index data of a network slice and to detect the acquired multi-dimensional index data;

an action generation module, configured to, when abnormal data is detected in the multi-dimensional index data of the network slice, generate a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluate the generated slice self-healing action and slice fault action to determine behavior scores of the slice self-healing action and the slice fault action;

a dual-agent adversarial module, configured to iteratively optimize the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and to use the target self-healing action to repair the fault of the network slice.

In addition, to achieve the above purpose, the present invention also provides network slice fault self-healing equipment, which includes a memory, a processor, and a network slice fault self-healing program stored in the memory and executable on the processor, wherein the network slice fault self-healing program, when executed by the processor, implements the steps of the network slice fault self-healing method described above.

In addition, to achieve the above purpose, the present invention also provides a computer program product, which includes a computer program that, when executed by a processor, implements the steps of the network slice fault self-healing method described above.

Embodiments of the present invention provide a network slice fault self-healing method, device, equipment and computer program product. In contrast to existing network slice fault self-healing methods, whose self-healing strategies perform poorly, the embodiments of the present invention acquire multi-dimensional index data of a network slice and detect the acquired data; when abnormal data is detected in the multi-dimensional index data of the network slice, generate a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluate the generated actions to determine their behavior scores; and iteratively optimize the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, which is then used to repair the fault of the network slice. When an anomaly is detected in the multi-dimensional index data of the network slice, faults are actively created by generating slice fault actions that compete against the generated slice self-healing actions. This forces the generated slice self-healing actions to improve their fault repair capability, which improves the fault self-healing performance of the network slice self-healing strategy. Faults in the network slice can thus be identified and repaired in time and healed before they cause an interruption, avoiding serious consequences and improving the self-healing capability of the network slice.

Description of the drawings

FIG. 1 is a schematic diagram of the hardware structure of one implementation of the network slice fault self-healing equipment provided by an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a first embodiment of the network slice fault self-healing method of the present invention;

FIG. 3 is a schematic diagram of the adversarial process of the action generators in a second embodiment of the network slice fault self-healing method of the present invention;

FIG. 4 is a schematic diagram of the neural network layer architecture of the action generators and action evaluators in a third embodiment of the network slice fault self-healing method of the present invention;

FIG. 5 is a schematic diagram of the functional modules of an embodiment of the network slice fault self-healing device of the present invention.

The realization of the purpose, the functional characteristics and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.

In the following description, suffixes such as "module", "part" or "unit" used to denote elements serve only to facilitate the description of the present invention and have no specific meaning by themselves. Therefore, "module", "part" and "unit" may be used interchangeably.

The main solution of the embodiments of the present invention is as follows: acquire multi-dimensional index data of a network slice and detect the acquired data; when abnormal data is detected in the multi-dimensional index data of the network slice, generate a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluate the generated actions to determine their behavior scores; iteratively optimize the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and use the target self-healing action to repair the fault of the network slice. By actively creating faults, the two agents that generate slice self-healing actions and slice fault actions compete against each other. This improves the fault self-healing performance of the network slice self-healing strategy, so that faults in the network slice are identified and repaired in time and healed before they cause an interruption, which avoids serious consequences and improves the self-healing capability of the network slice.

The main technical terms involved in the embodiments of the present invention are:

Network slice (NS): a network slice is a logical network customized for particular service requirements on top of physical or virtual network infrastructure. A network slice can be a complete end-to-end network comprising terminal equipment, an access network, a transport network, a core network and application servers, which can provide complete communication services and has certain network capabilities. A network slice can also be any combination of terminal equipment, access network, transport network, core network and application servers.

Chaos engineering: chaos engineering is a technique for ensuring system availability and improving the resilience of a technical architecture. It aims to nip faults in the bud, that is, to identify faults before they cause an interruption. By actively creating faults and testing the behavior of the system under various kinds of stress, fault problems are proactively identified and repaired so that serious consequences are avoided.

NSMF: Network Slice Management Function. It receives network slice requirements, manages the life cycle, performance and faults of network slices, orchestrates the composition of network slices, decomposes the requirements of a network slice into requirements on the individual network slice subnets or network functions, and sends network slice subnet management requests to each NSSMF.

NSSMF: Network Slice Subnet Management Function. It receives the network slice subnet deployment requirements issued by the NSMF, manages the network slice subnets, orchestrates the composition of the network slice subnets, maps the SLA (Service Level Agreement) requirements of a network slice subnet to QoS (Quality of Service) requirements of network services, and issues network service deployment requests.

The embodiments of the present invention take into account that, in existing related solutions, the network slice fault self-healing strategy relies on manually set rules, and the self-healing procedure is time-consuming and error-prone. When the slice fault changes, much time is spent on fault analysis and localization, and the self-healing strategy cannot be adjusted in time. To address this, chaos engineering has been introduced into the self-healing strategy of network slices: faults are actively created, the behavior of the system under various kinds of stress is tested, and fault problems are proactively identified and repaired to avoid serious consequences. However, existing chaos-engineering-based network slice fault self-healing methods mainly rely on manually created faults. The pattern of these faults is easily learned by the self-healing system, which reduces the generalization ability and the self-healing capability of the self-healing strategy. Existing network slice fault self-healing strategies therefore generally suffer from poor fault self-healing performance.

The embodiments of the present invention therefore propose the following solution: when an anomaly is detected in the multi-dimensional index data of a network slice, faults are actively created by generating slice fault actions, which compete against the generated slice self-healing actions. This forces the generated slice self-healing actions to keep improving their fault repair capability and thus improves the performance of the fault self-healing strategy. Faults are identified and repaired before they cause an interruption, which improves the fault self-healing capability of the network slice and avoids serious consequences.

Specifically, referring to FIG. 1, FIG. 1 is a schematic diagram of the functional modules of the terminal device to which the network slice fault self-healing device of an embodiment of the present invention belongs. The terminal device (also called a terminal or device) may be a PC, or a mobile terminal device with data processing capability such as a smartphone, a tablet computer or a portable computer.

As shown in FIG. 1, the terminal device may include a processor 1001 (for example a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 provides the connections and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory, such as disk storage. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.

Optionally, the terminal may further include a camera, an RF (radio frequency) circuit, sensors, an audio circuit, a Wi-Fi module and so on. The sensors may include light sensors, motion sensors and other sensors. Specifically, the light sensors may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display according to the ambient light, and the proximity sensor can turn off the display and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the posture of the mobile terminal (such as switching between landscape and portrait, related games and magnetometer posture calibration) and in vibration recognition functions (such as a pedometer or tap detection). The mobile terminal may of course also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer and infrared sensor, which are not described further here.

Those skilled in the art will understand that the terminal structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

As shown in FIG. 1, the memory 1005, as a computer program product, may include an operating system, a network communication module, a user interface module and a network slice fault self-healing program.

In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a background server and exchange data with it; the user interface 1003 is mainly used to connect to a client (user side) and exchange data with it; and the processor 1001 may be used to call the network slice fault self-healing program stored in the memory 1005. When executed by the processor, the network slice fault self-healing program implements the operations of the network slice fault self-healing method provided in the following embodiments.

Based on the above device hardware structure, embodiments of the network slice fault self-healing method of the present invention are proposed.

Referring to FIG. 2, in a first embodiment of the network slice fault self-healing method of the present invention, the network slice fault self-healing method includes:

Step S10: acquiring multi-dimensional index data of a network slice, and detecting the acquired multi-dimensional index data;

With the development of communication technology, and especially with the spread and application of fifth-generation (5G) mobile communication technology, the concept of network slicing was introduced to cope with the differing network performance requirements of different communication services. To guarantee the continuity of network services, a network slice needs a certain self-healing capability, so that faults are healed before they cause a network interruption.

The present invention proposes a network slice fault self-healing method applicable to 5G network slicing. The method is based on dual-agent reinforcement learning and chaos engineering: faults are created proactively, so that both fault actions and self-healing actions are generated, and the confrontation between the fault-action agent and the self-healing-action agent improves the generalization of slice self-healing and forces the generated self-healing actions to keep improving their fault-repair capability, which in turn improves the self-healing recovery capability of the network slice. Specifically, multi-dimensional index data of the network slice is first acquired and then checked to determine whether the network slice contains an anomaly.

The network environment hosting a network slice generally includes an NSMF (Network Slice Management Function) and NSSMFs (Network Slice Subnet Management Functions). The NSMF manages the network slice and is mainly responsible for monitoring its multi-dimensional index data. A network slice generally comprises terminal equipment, an access network, a transport network, a core network, and application servers; the access network, transport network, and core network of the slice comprise multiple network sub-slices managed by the NSSMF, such as radio access network sub-slices, transport network sub-slices, and core network sub-slices.

In this embodiment, acquiring the multi-dimensional index data of the network slice mainly means obtaining from the NSMF the KPI (Key Performance Indicator) data of the network sub-slices, including the KPIs of the radio access network, transport network, and core network sub-slices. The KPIs of a radio access network sub-slice include radio access network transmission delay, average uplink/downlink user throughput, average uplink/downlink cell throughput, average CPU occupancy, number of online users, QoS flow establishment success rate, call establishment success rate, and so on. The KPIs of a transport network sub-slice include transport network transmission delay, bandwidth utilization, packet loss rate, data transmission volume, bit error rate, and so on. The KPIs of a core network sub-slice include core network transmission delay, virtualized storage resource utilization, virtualized network resource utilization, virtualized computing resource utilization, number of error codes, request success rate, and so on. The acquired KPI data is checked. When abnormal KPI data is detected, for example a KPI value clearly outside its normal range, the network slice contains a fault or a potential fault. The fault must be identified and repaired automatically before it interrupts service, so that the network slice self-heals and severe consequences such as downtime are avoided.
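The detection step described above can be illustrated with a minimal sketch. The source does not say how "clearly outside the normal range" is decided, so the sketch simply assumes a fixed normal range per KPI; the KPI names, the range values, and the `detect_anomalies` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical normal ranges (low, high) per KPI; the values are illustrative only.
NORMAL_RANGES: Dict[str, Tuple[float, float]] = {
    "ran_latency_ms": (0.0, 20.0),
    "transport_packet_loss": (0.0, 0.01),
    "core_request_success_rate": (0.99, 1.0),
}


def detect_anomalies(kpis: Dict[str, float]) -> List[str]:
    """Return the names of the KPIs whose value lies outside its normal range."""
    abnormal = []
    for name, value in kpis.items():
        low, high = NORMAL_RANGES[name]
        if not (low <= value <= high):
            abnormal.append(name)
    return abnormal


sample = {
    "ran_latency_ms": 35.0,               # above the assumed upper bound
    "transport_packet_loss": 0.002,
    "core_request_success_rate": 0.995,
}
abnormal_kpis = detect_anomalies(sample)
```

A non-empty result would be the trigger for step S20; a production detector would more plausibly use learned or statistical thresholds than fixed ones.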

Step S20, when abnormal data is detected in the multi-dimensional index data of the network slice, generating a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluating the generated slice self-healing action and slice fault action to determine the behavior scores of the slice self-healing action and the slice fault action;

In this embodiment, the network slice fault self-healing method is provided with a self-healing model. When abnormal data is detected in the acquired multi-dimensional index data, the model generates a slice self-healing action and a slice fault action from that data, and then evaluates the generated actions to determine their behavior scores. When generating slice fault actions, the model is made to converge toward higher fault-action behavior scores, so that it produces fault actions that make the network slice hard to repair. When generating slice self-healing actions, the model is likewise made to converge toward higher self-healing-action behavior scores, so that it produces self-healing actions that can repair the slice fault. The generated actions are then evaluated and, according to their behavior scores, set against each other, forcing the generated slice self-healing action to keep improving its fault-repair capability.

The refinement of step S20 includes steps A1-A2:

Step A1, inputting the multi-dimensional index data into a preset target self-healing model, wherein the target self-healing model is obtained by iteratively training a preset self-healing model to be trained on historical multi-dimensional index data of the network slice, and the target self-healing model includes an action generator;

Step A2, generating a slice self-healing action and a slice fault action according to the multi-dimensional index data by using the action generator of the target self-healing model.

Further, before the slice self-healing action and slice fault action are generated, the acquired multi-dimensional index data may be preprocessed. The preprocessing includes, but is not limited to, normalization. Taking normalization as an example, the acquired index data is scaled proportionally and mapped uniformly into a small interval, that is, scaled to lie between a given minimum and maximum, usually 0 and 1, to obtain the target index data. Subsequent processing of the target index data then converges faster and is more accurate. The preprocessed multi-dimensional index data is input into the preset target self-healing model, which is obtained by iteratively training the basic self-healing model to be trained on historical multi-dimensional index data of the network slice. The model includes a dual-agent action generator that generates a slice self-healing action and a slice fault action, respectively, from the input multi-dimensional index data.
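The normalization described above corresponds to ordinary min-max scaling. The following is a minimal sketch under that reading; the function name `min_max_scale`, the handling of a constant series, and the sample latency values are assumptions made for illustration.

```python
from typing import List


def min_max_scale(values: List[float], lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Scale values linearly into [lo, hi]; a constant series maps to lo."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # No spread to scale; returning lo avoids a division by zero.
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


latency_ms = [10.0, 20.0, 40.0]      # illustrative KPI samples
scaled = min_max_scale(latency_ms)   # smallest value maps to 0, largest to 1
```

Each KPI would be scaled independently, so that indicators with very different units (milliseconds, ratios, counts) end up on a comparable scale before they reach the model.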

Further, the action generator includes a self-healing action generator and a fault action generator. The refinement of step A2, in which the action generator of the target self-healing model generates a slice self-healing action and a slice fault action according to the multi-dimensional index data, includes steps A21-A23:

Step A21, determining the current full-information state of the network slice according to the multi-dimensional index data, wherein the full-information state includes a first state observable by the self-healing action generator and a second state observable by the fault action generator;

Step A22, inputting the first state of the full-information state into the self-healing action generator, and generating a slice self-healing action by using the self-healing action generator;

Step A23, inputting the second state of the full-information state into the fault action generator, and generating a slice fault action by using the fault action generator.

In this embodiment, the action generator of the target self-healing model includes a self-healing action generator (actor1) and a fault action generator (actor2). When the action generator of the preset target self-healing model generates the slice self-healing action and the slice fault action, the current full-information state of the network slice is first determined from the input target index data. The full-information state includes the first state (s₁) observable by actor1 and the second state (s₂) observable by actor2. actor1 then generates the slice self-healing action from the state s₁ it can observe, and actor2 generates the slice fault action from the state s₂ it can observe.

Step S30, iteratively optimizing the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and performing fault repair on the network slice by using the target self-healing action.

In the target self-healing model, each action generator generates its action from the partial state of the network slice that it can observe. The generated slice self-healing action and slice fault action are evaluated to determine their behavior scores, and the actions are iteratively optimized according to those scores, forcing the self-healing action generator to produce slice self-healing actions with higher fault-repair capability, until the target self-healing action that lets the network slice self-heal is determined. The target self-healing action is then used to repair the network slice. Ways of repairing the network slice include, but are not limited to, switching the state of the network slice according to the determined target self-healing action.

In this embodiment, multi-dimensional index data of the network slice is acquired and checked; when abnormal data is detected in it, a slice self-healing action and a slice fault action are generated according to the multi-dimensional index data and evaluated to determine their behavior scores; the generated actions are iteratively optimized according to the behavior scores to determine a target self-healing action, which is then used to repair the network slice. When an anomaly is detected in the multi-dimensional index data of the network slice, a fault is created proactively by generating a slice fault action, which is set against the generated slice self-healing action and forces it to improve its fault-repair capability. This improves the fault self-healing performance of the slice self-healing strategy: faults in the network slice are identified and repaired in time, before they cause an interruption, severe consequences are avoided, and the self-healing recovery capability of the network slice is improved.

Further, on the basis of the above embodiment of the present invention, a second embodiment of the network slice fault self-healing method of the present invention is proposed.

This embodiment refines step S20 of the first embodiment. It differs from the above embodiment in that the target self-healing model further includes a self-healing action evaluator and a fault action evaluator. In step S20 of the above embodiment, evaluating the generated slice self-healing action and slice fault action to determine their behavior scores is refined into steps B1-B3:

Step B1, inputting the first state of the full-information state, the slice self-healing action, and the slice fault action into the self-healing action evaluator of the target self-healing model, so as to evaluate the slice self-healing action and determine the behavior score of the slice self-healing action;

Step B2, determining a fault damage radius according to the slice fault action;

Step B3, inputting the second state of the full-information state, the fault damage radius, the slice self-healing action, and the slice fault action into the fault action evaluator of the target self-healing model, so as to evaluate the slice fault action and determine the behavior score of the slice fault action.

Based on the above embodiment, in this embodiment the generated slice self-healing action and slice fault action are evaluated by inputting the full-information state of the network slice, together with the generated slice self-healing action and slice fault action, into the evaluators of the target self-healing model, which determine the behavior scores of the two actions. Specifically, the target self-healing model further includes a self-healing action evaluator and a fault action evaluator. The generated slice self-healing action (a₁), the network slice state s₁ observable by the self-healing action generator actor1, and the generated slice fault action (a₂) are input into the self-healing action evaluator (critic1), which evaluates the slice self-healing action and yields its behavior score Q₁(s₁, a₁, a₂). The fault damage radius (c₂) is determined from the generated slice fault action. The network slice state s₂ observable by the fault action generator actor2, the fault damage radius c₂, the slice fault action a₂, and the slice self-healing action a₁ are input into the fault action evaluator (critic2), which evaluates the slice fault action and yields its behavior score Q₂(s₂, c₂, a₂, a₁).
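The argument lists of the two scores, Q₁(s₁, a₁, a₂) and Q₂(s₂, c₂, a₂, a₁), can be made concrete with a small sketch of how the evaluator inputs might be assembled. The source does not say how states and actions are encoded or how c₂ is computed, so plain concatenation of numeric vectors is assumed here, and the helper names and sample values are hypothetical.

```python
from typing import List, Sequence


def critic1_input(s1: Sequence[float], a1: Sequence[float],
                  a2: Sequence[float]) -> List[float]:
    """Input vector of critic1, which scores the self-healing action: Q1(s1, a1, a2)."""
    return [*s1, *a1, *a2]


def critic2_input(s2: Sequence[float], c2: float, a2: Sequence[float],
                  a1: Sequence[float]) -> List[float]:
    """Input vector of critic2, which scores the fault action: Q2(s2, c2, a2, a1)."""
    return [*s2, c2, *a2, *a1]


# Illustrative values only.
s1, s2 = [0.2, 0.7], [0.9]                 # partial states seen by actor1 / actor2
a1, a2 = [1.0, 0.0, 0.0], [0.0, 1.0]       # self-healing action / fault action
c2 = 0.5                                   # fault damage radius, taken as given
x1 = critic1_input(s1, a1, a2)
x2 = critic2_input(s2, c2, a2, a1)
```

The point of the sketch is that each evaluator sees the opponent's action as well as its own agent's action, which is what lets the two scores reflect the confrontation.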

Further, the refinement of step S30 in the above embodiment includes steps C1-C2:

Step C1, feeding the behavior score of the slice self-healing action back to the self-healing action generator and the behavior score of the slice fault action back to the fault action generator, and returning to and executing the steps of generating a slice self-healing action with the self-healing action generator of the target self-healing model and generating a slice fault action with the fault action generator of the target self-healing model, so as to optimize and update the slice self-healing action and the slice fault action until their behavior scores satisfy a preset condition, whereupon the target self-healing action is obtained;

Step C2, switching the network slice from its current full-information state to the target full-information state corresponding to the target self-healing action, so as to repair the fault of the network slice.

Further, after the behavior scores of the slice self-healing action and slice fault action have been determined, determining the target self-healing action from those scores in order to repair the network slice requires feeding the two behavior scores back to the corresponding action generators actor1 and actor2, and returning to and executing the steps of generating a slice self-healing action with the self-healing action generator and a slice fault action with the fault action generator of the target self-healing model, so that a new slice self-healing action and a new slice fault action are generated. The newly generated actions are then evaluated. Through a DQN-style (deep Q-learning) loop structure, the generated slice self-healing action and slice fault action are continually optimized and updated, forming a confrontation between the two agents actor1 and actor2 that improves the stability and convergence of the self-healing model. This continues until the generated slice self-healing action and slice fault action satisfy a preset condition, at which point the target self-healing action is determined. The preset condition may be that the slice self-healing action has the highest behavior score and can repair the fault produced by the slice fault action.

Referring to Fig. 3, Fig. 3 is a schematic diagram of the confrontation between the slice self-healing action and the slice fault action generated by the target self-healing model in this embodiment. In Fig. 3, after the network slice outputs a full-information state s_all, actor1 and actor2 can each obtain only the part of the state that they can observe, whereas critic1 and critic2 can obtain the full-information state as well as the policy actions taken by the two agents, actor1 and actor2. That is, although an actor can obtain neither the full information nor the policies of the other actor, the evaluator corresponding to each actor can observe all the information and guide its actor in optimizing its own policy. Specifically, the behavior scores produced by the evaluators critic1 and critic2, that is, the values Q₁ and Q₂, are fed back to the action generators actor1 and actor2, respectively. Each action generator determines, according to the deterministic behavior policy corresponding to the current behavior score, whether the currently generated action is the optimal policy; if not, it generates a new action to optimize and update its own policy. The deterministic behavior policy of the action generators actor1 and actor2 is DPG (Deterministic Policy Gradient), under which the action at each step is obtained directly as a definite value through the function μ, as shown in Formula 1:

$$a_t = \mu(s_t \mid \theta^{\mu}) \qquad (1)$$

In Formula 1, a_t is the action selected at time t, s_t is the state of the environment at time t, and θ^μ is the weight value. The function μ is the optimal behavior policy; training it yields a deterministic optimal behavior policy function. The deterministic policy gradient is shown in Formula 2:

Here ∇ denotes the gradient. It should be noted that the deterministic policy in the present invention is DDPG (Deep Deterministic Policy Gradient), which builds on DPG and fuses deep neural networks with the policy-learning method of DPG; that is, both the value function and the policy function are represented by neural networks. Relative to DPG, DDPG uses neural networks to approximate the policy function μ and the Q function, namely the policy network and the Q network, and trains these networks with deep-learning methods to obtain the deterministic policy gradient by which the action generators actor1 and actor2 of the target self-healing model select the actions they generate.
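The expression for Formula 2 is not present in the text. For orientation only, the sampled deterministic policy gradient as it is usually written for DDPG, in the symbols of Formula 1, is given below. This is the standard form and is assumed, not confirmed, to correspond to the patent's Formula 2; J (the expected return), N (the number of sampled transitions), and θ^Q (the weights of the Q network) are symbols introduced for this sketch.

```latex
\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q(s, a \mid \theta^{Q}) \Big|_{s = s_i,\; a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \Big|_{s = s_i}
```

Read from right to left: the policy network's sensitivity to its weights is weighted by how much the Q network says a change in the action would improve the score.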

Further, the deterministic policy gradient of Formulas 1-2 is only a preferred policy of this embodiment; it serves to illustrate the embodiment and does not limit the present invention. In this embodiment, the generated slice self-healing action and slice fault action are optimized and updated according to the behavior scores produced by the evaluators. Once the target self-healing action has been determined, the network slice is switched from its current full-information state to the target full-information state corresponding to the target self-healing action, which completes the fault repair of the network slice. Examples of switching the full-information state of the network slice are: for a radio access network sub-slice, switching to a standby service channel; for a transport network sub-slice, switching to a standby logical port; for a core network sub-slice, switching to a standby virtual machine.
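The three switching examples above can be tied to an action vector with a short sketch. The source does not specify how a self-healing action is encoded, so the sketch assumes one numeric component per sub-slice type and a threshold that decides whether the switch is performed; `decode_action`, `SWITCH_ACTIONS`, and the 0.5 threshold are hypothetical.

```python
from typing import List, Sequence

# One entry per sub-slice type, in an assumed fixed order.
SWITCH_ACTIONS = (
    "radio access sub-slice: switch to standby service channel",
    "transport sub-slice: switch to standby logical port",
    "core sub-slice: switch to standby virtual machine",
)


def decode_action(a1: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Return the switch operations whose action component reaches the threshold."""
    return [name for name, value in zip(SWITCH_ACTIONS, a1) if value >= threshold]


# Illustrative target self-healing action: act on the radio and core sub-slices.
operations = decode_action([0.9, 0.1, 0.7])
```

In a deployment, each returned operation would presumably be handed to the management function of the corresponding sub-slice for execution.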

In this embodiment, the two agents are controlled to generate a slice self-healing action and a slice fault action, and the evaluators evaluate the generated actions. Based on the deterministic policy gradient, the generated actions are optimized and updated according to the resulting behavior scores, so that the two agents of the action generator act against each other. This forces the self-healing action generator to produce self-healing actions with higher fault-repair capability, from which the optimal behavior policy, namely the target self-healing action, is determined, improving the generalization of the self-healing strategy. Switching the state of the network slice according to the determined target self-healing action improves the fault self-healing capability of the network slice.

Based on the above embodiments, a third embodiment of the network slice fault self-healing method of the present invention is proposed. This embodiment concerns the steps that follow step S30 of the above embodiments. After step S30, the method further includes steps D1-D2:

Step D1, inputting the target self-healing action into a preset reward function to obtain the reward value corresponding to the target self-healing action;

Step D2, generating an experience replay data set according to the reward value, and updating the model parameters of the target self-healing model according to the experience replay data set.

Based on the above embodiments, in this embodiment, after the network slice has been repaired according to the determined target self-healing action, the target self-healing action is input into a preset reward function to obtain the corresponding reward value. An experience replay data set is generated from the reward value, and the model parameters of the target self-healing model are updated using that data set.

Specifically, the action generators and evaluators of the target self-healing model each include an estimation network and a target network. An item of experience replay data has the form (s, a, r, s′), where s is the full-information state of the network slice before the switch performed by the target self-healing action, s′ is the full-information state of the network slice after the switch, a = (a₁, a₂) is the pair of actions taken by actor1 and actor2 in the full-information state of the network slice, and r is the reward value, that is, the return the actor obtains from the environment after acting. The experience replay data set thus records, for each state of the network slice, the action, the reward, and the resulting next state as (s, a, r, s′). The set has a fixed storage capacity; once it is full, new data overwrites the data that was stored earliest.
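The fixed-capacity, overwrite-oldest behavior of the experience replay data set can be sketched with a bounded deque. This is an illustrative sketch, not the patent's implementation; the class name `ReplayBuffer`, its methods, and the tuple encoding of states and actions are assumptions.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Transition = Tuple[tuple, tuple, float, tuple]  # (s, a, r, s')


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s') tuples that overwrites the oldest entry."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops its oldest item when appended to while full.
        self._data: Deque[Transition] = deque(maxlen=capacity)

    def add(self, s: tuple, a: tuple, r: float, s_next: tuple) -> None:
        self._data.append((s, a, r, s_next))

    def sample(self, n: int) -> List[Transition]:
        """Draw n distinct transitions uniformly at random."""
        return random.sample(list(self._data), n)

    def __len__(self) -> int:
        return len(self._data)


buffer = ReplayBuffer(capacity=2)
for step in range(3):
    # The third add evicts the transition stored first (the one with r = 0.0).
    buffer.add(s=(step,), a=(0, 1), r=float(step), s_next=(step + 1,))
```

Uniform random sampling from such a buffer breaks the correlation between consecutive transitions, which is the usual motivation for experience replay.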

Updating the model parameters with the experience replay data set mainly means updating the estimation networks; the parameters of the target networks are obtained by tracking the parameters of the estimation networks. To update the estimation networks, the action generator (actor) network μ(s|θ^μ) and the evaluator (critic) network Q(s,a|θ^Q) are first initialized with weight values θ^μ and θ^Q, respectively, and the target networks are then initialized as Q′ = Q(s,a|θ^Q) and μ′ = μ(s|θ^μ). N items of experience replay data are selected at random from the experience replay data set, and the evaluator parameters are updated using the target function shown in Formula 3:

$$y_i = r_i + \gamma\, Q'\big(s_{i+1},\ \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\big) \qquad (3)$$

In Formula 3, y_i is the target value produced by the target networks, r_i is the reward value, θ^μ′ and θ^Q′ are the target weight values, and γ is the discount factor. The evaluator parameters are updated by minimizing the loss function of Formula 4.
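The expression for Formula 4 is not present in the text. Given the description of the loss as a squared error between the target value y_i and the estimated Q value, the usual DDPG critic loss is shown below as an assumed reconstruction; the averaging over the N sampled transitions is the standard convention and is not confirmed by the source.

```latex
L = \frac{1}{N} \sum_{i=1}^{N} \big( y_i - Q(s_i, a_i \mid \theta^{Q}) \big)^{2}
```

Here y_i is the target value of Formula 3 and Q(s_i, a_i | θ^Q) is the estimate of the evaluator's estimation network for the stored state and action.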

In Formula 4, L is the loss function. In this embodiment, the loss is the squared error between the actual (target) Q value and the estimated Q value. The estimated Q value is obtained by inputting the current state s, together with the action a output by the action estimation network, into the evaluator's estimation network. The actual Q value is the sum of the actual reward r and the discounted Q value of the next step, where that Q value is obtained by inputting the next state s′ and the action a′ produced by the action target network into the evaluator's target network. The evaluator parameters are updated by minimizing this loss.

The action generator (actor) network is updated with the deterministic policy gradient, based on the deterministic policy of Formula 1. The actor aims to output actions with as high a Q value as possible, so its loss can be understood simply: the larger the Q value fed back, the smaller the loss, and the smaller the Q value fed back, the larger the loss. Accordingly, the actor's own parameter gradient is combined with the evaluator's action gradient and, following the gradient update of Formula 5, the gradient of the actor network is updated so that the actor adjusts its parameters in the direction more likely to yield a larger Q value.
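The evaluator update described above can be sketched numerically: compute the target value of Formula 3 for each sampled transition, then take the squared error against the estimated Q value. The networks are replaced here by toy one-dimensional callables, the loss is averaged over the batch by assumption, and all names (`td_target`, `critic_loss`, the lambdas) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]
Transition = Tuple[State, Action, float, State]  # (s, a, r, s')


def td_target(r: float, s_next: State,
              target_actor: Callable[[State], Action],
              target_critic: Callable[[State, Action], float],
              gamma: float) -> float:
    """Formula 3: y = r + gamma * Q'(s', mu'(s'))."""
    return r + gamma * target_critic(s_next, target_actor(s_next))


def critic_loss(batch: List[Transition],
                critic: Callable[[State, Action], float],
                target_actor: Callable[[State], Action],
                target_critic: Callable[[State, Action], float],
                gamma: float) -> float:
    """Mean squared error between targets y_i and estimates Q(s_i, a_i)."""
    errors = [
        (td_target(r, s_next, target_actor, target_critic, gamma) - critic(s, a)) ** 2
        for s, a, r, s_next in batch
    ]
    return sum(errors) / len(errors)


# Toy stand-ins for the networks (not trained models).
target_actor = lambda s: [2.0 * s[0]]
target_critic = lambda s, a: s[0] + a[0]
critic = lambda s, a: s[0] + a[0]

batch = [([0.0], [0.0], 1.0, [1.0])]   # one transition (s, a, r, s')
loss = critic_loss(batch, critic, target_actor, target_critic, gamma=0.5)
```

In a real implementation the loss would be minimized by gradient descent on the evaluator's weights; the sketch only shows what quantity is being minimized.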

Finally, the target networks learn by tracking the parameter updates of the estimation networks: the target network parameters are updated according to Formula 6, which gives the weight update rule for the critic and actor target networks:

$$\theta'_i \leftarrow \tau\,\theta_i + (1-\tau)\,\theta'_i \qquad (6)$$

In this embodiment, the update rate is chosen so that the target parameters retain a weight very close to 1 on their previous values (with Formula 6 as written, this means a small τ, so that 1 − τ is very close to 1). The target network parameters θ′ then do not fluctuate greatly, which improves the stability of the model.
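The soft update of Formula 6 can be sketched on plain lists of weights. With the formula as written, the old target parameters are weighted by 1 − τ, so the example uses a small τ to keep that weight near 1; the value 0.01, the function name `soft_update`, and the sample weights are illustrative assumptions.

```python
from typing import List


def soft_update(target: List[float], estimate: List[float], tau: float) -> List[float]:
    """Formula 6, element-wise: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * e + (1.0 - tau) * t for t, e in zip(target, estimate)]


target_weights = [0.0, 1.0]      # theta' (target network)
estimate_weights = [1.0, 1.0]    # theta  (estimation network)
new_target = soft_update(target_weights, estimate_weights, tau=0.01)
# The first weight moves only 1% of the way toward the estimation network's value.
```

Applying this after every training step lets the target networks trail the estimation networks slowly, which keeps the target values of Formula 3 from shifting abruptly.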

Before step S20, the method further includes steps E1-E3:

Step E1, acquiring historical index data of the network slice and preprocessing it to obtain sample data;

Step E2, acquiring model architecture parameters, and building a basic self-healing model according to the model architecture parameters;

Step E3, iteratively training the basic self-healing model with the sample data to obtain the target self-healing model.

Furthermore, based on the above embodiments, the model must be pre-trained before the slice self-healing action and slice fault action are generated from the preprocessed target index data. To pre-train the model, historical multi-dimensional index data of the network slice is first acquired and preprocessed to obtain the sample data for pre-training. The policy function, the evaluation function, and the model architecture parameters are then acquired, and a basic self-healing model is built. Finally, the basic model is iteratively trained with the sample data to obtain the target self-healing model. The model architecture parameters include the number of layers of the model's neural networks and the number of neurons in each layer.

Building on the target self-healing model of the above embodiments, the basic self-healing model built in this embodiment is a dual-agent DDPG model comprising two action generators (actors) and two evaluators (critics). Each actor or critic contains two neural networks of identical structure, a target network (target_net) and an estimate network (eval_net), which differ only in how often their parameters are updated. Taking actor1 and critic1 as examples, the basic self-healing model is shown in Figure 4, a schematic diagram of the neural network layer architecture of the self-healing action generator and the self-healing action evaluator of the basic self-healing model to be trained. In Figure 4, the input layer of actor1 receives the state s₁ of the current network slice, and the hidden part contains 3 fully connected (Dense) layers, with 256 and 128 neurons respectively. A dropout layer follows each fully connected layer to effectively avoid overfitting: the dropout layer discards neurons with probability p and retains the others with probability q = 1 − p. In this embodiment the dropout probability is set to p = 0.2, that is, 20% of the neurons are randomly ignored and deactivated. The output layer is a fully connected (Dense) layer with 3 neurons, which outputs the self-healing actions for the network sub-slices: the radio network sub-slice (switch to the standby traffic channel), the transport network sub-slice (switch to the standby logical port), and the core network sub-slice (switch to the standby virtual machine).
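The actor1 layout described above can be sketched in plain Python. The text names three hidden Dense layers but gives only two widths (256 and 128), so the sketch takes the widths as a parameter and defaults to the two stated values. The ReLU and tanh activations, the weight initialisation, the rescaling of retained units during dropout and the state dimension of 12 are all assumptions that the text does not specify.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (one row per output neuron) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer, activation):
    """Fully connected layer followed by the named activation."""
    weights, biases = layer
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    if activation == "relu":
        return [max(0.0, v) for v in out]
    return [math.tanh(v) for v in out]


def dropout(values, p, rng, training):
    """Drop each unit with probability p; keep the rest with q = 1 - p.

    Survivors are rescaled by 1/q (inverted dropout, an assumption).
    """
    if not training:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def build_actor(state_dim, hidden_sizes=(256, 128), action_dim=3, seed=0):
    rng = random.Random(seed)
    sizes = [state_dim, *hidden_sizes, action_dim]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def actor_forward(layers, state, p_drop=0.2, training=False, rng=None):
    if rng is None:
        rng = random.Random(0)
    x = list(state)
    for layer in layers[:-1]:
        x = dropout(dense(x, layer, "relu"), p_drop, rng, training)
    # Three outputs: radio, transport and core sub-slice self-healing actions.
    return dense(x, layers[-1], "tanh")


actor1 = build_actor(state_dim=12)          # 12 KPIs is a hypothetical size
action = actor_forward(actor1, [0.1] * 12)  # inference: dropout disabled
print(action)
```

Dropout is active only when `training=True`, so repeated inference calls on the same state return the same action.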

The evaluator critic1 has two input layers: input layer 1 receives the full-information state of the network slice over the most recent time period T, and input layer 2 receives the generated slice self-healing action and slice fault action. Input layer 1 is followed by two fully connected (Dense) layers with 256 and 128 neurons respectively, and input layer 2 is followed by one fully connected (Dense) layer with 16 neurons. A merge layer then combines the action and state branches, followed by a fully connected layer with 128 neurons and an output layer with a single neuron, which outputs the value Q₁(s₁, a₁, a₂) that evaluates this action selection.
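The two-branch critic1 layout can be sketched in the same way. The merge layer is interpreted here as a concatenation of the two branch outputs, which is an assumption, as are the ReLU activations, the weight initialisation, the state dimension of 12 and the choice of three components for each of a₁ and a₂.

```python
import math
import random


def init_dense(n_in, n_out, rng):
    """Random weights (one row per output neuron) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu=True):
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def build_critic(state_dim, action_dim, seed=0):
    rng = random.Random(seed)
    return {
        # Branch under input layer 1: Dense(256) -> Dense(128)
        "state_branch": [init_dense(state_dim, 256, rng),
                         init_dense(256, 128, rng)],
        # Branch under input layer 2: Dense(16)
        "action_branch": [init_dense(action_dim, 16, rng)],
        # After the merge layer: Dense(128) -> single-neuron output
        "merged": init_dense(128 + 16, 128, rng),
        "q_out": init_dense(128, 1, rng),
    }


def critic_forward(critic, state, heal_action, fault_action):
    s = list(state)
    for layer in critic["state_branch"]:
        s = dense(s, layer)
    a = list(heal_action) + list(fault_action)
    for layer in critic["action_branch"]:
        a = dense(a, layer)
    merged = s + a  # merge layer, taken here to mean concatenation
    hidden = dense(merged, critic["merged"])
    return dense(hidden, critic["q_out"], relu=False)[0]


critic1 = build_critic(state_dim=12, action_dim=6)  # hypothetical sizes
q1 = critic_forward(critic1, [0.2] * 12, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(q1)
```

The output neuron is left linear so that the Q value is not restricted to a fixed range.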

After the neural network layer structure of the basic self-healing model has been built, the model is trained with the acquired sample data, and the two agents' actors are trained separately. Specifically, the historical multi-dimensional KPIs of the network slice (s₁), covering the radio access network sub-slice, the transport network sub-slice and the core network sub-slice, are input into actor1, a fully connected neural network, which outputs the generated self-healing action for that network slice. The state KPIs of the network slice (s₁), the generated slice self-healing action (a₁) and the slice fault action (a₂) are then input into the evaluator critic1, a multi-branch fully connected neural network, which outputs the Q value Q₁(s₁, a₁, a₂) that evaluates this action selection. The Q₁ value is fed back to actor1, and actor1 uses it to select the self-healing action that reduces the service impact as far as possible. Once training has converged, the actor1 model weights serve as the slice self-healing action generator.

Next, the historical multi-dimensional KPIs of the network slice (s₂), covering the KPIs of the radio access network sub-slice, the transport network sub-slice and the core network sub-slice, together with the fault damage radius (c₂), are input into actor2, a fully connected neural network, which outputs the generated slice fault action (a₂). The network slice state KPIs (s₂), the fault damage radius (c₂), the generated slice self-healing action (a₁) and the slice fault action (a₂) are input into the evaluator critic2, a multi-branch fully connected neural network, which outputs the Q value Q₂(s₂, c₂, a₂, a₁) that evaluates this action selection. The Q₂ value is fed back to the action generator, and actor2 uses it to select the slice fault action that, within a given fault damage radius, makes the slice as hard as possible to self-heal. Once training has converged, the actor2 model weights serve as the slice fault action generator. During pre-training, the experience replay data (s, a, r, s') from all stages of training is saved to form an experience replay data set, which serves as the base data set for updating parameters while the model is running. Completing the iterative training of the basic self-healing model with the sample data yields the target self-healing model.
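The experience replay data set of (s, a, r, s') tuples can be sketched with a small buffer class. The class name `ReplayBuffer`, the uniform random sampling and the example transitions are assumptions. The text says that data from all stages is saved, so the buffer is unbounded by default and the optional `capacity` argument is only a convenience.

```python
import random
from collections import deque, namedtuple

# One experience replay record, matching the (s, a, r, s') tuple in the text.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Stores transitions and returns uniformly sampled mini-batches."""

    def __init__(self, capacity=None, seed=None):
        # capacity=None keeps every transition; a number keeps only the newest.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self._items.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


buffer = ReplayBuffer(seed=0)
for step in range(5):
    # Hypothetical scalar states and rewards, for illustration only.
    buffer.add(state=step, action=[0.0, 1.0, 0.0], reward=-float(step), next_state=step + 1)

batch = buffer.sample(3)
print(len(buffer), len(batch))
```

A parameter update would draw a batch like this and use it to compute the critic and actor losses.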

In this embodiment, the basic self-healing model is iteratively trained with the historical multi-dimensional index data of the network slice to obtain the target self-healing model. After the fault of the network slice has been repaired according to the target self-healing action, the target self-healing action is used to generate an experience replay data set, with which the model parameters are updated, improving the self-healing performance of the target self-healing model.

In addition, referring to Figure 5, an embodiment of the present invention further provides a network slice fault self-healing device, which includes:

a data detection module 10, configured to acquire multi-dimensional index data of a network slice and to perform detection on the acquired multi-dimensional index data;

an action generation module 20, configured to, when abnormal data is detected in the multi-dimensional index data of the network slice, generate a slice self-healing action and a slice fault action according to the multi-dimensional index data, and evaluate the generated slice self-healing action and slice fault action to determine the behavior scores of the slice self-healing action and the slice fault action;

a dual-agent adversarial module 30, configured to iteratively optimize the generated slice self-healing action and slice fault action according to the behavior scores to determine a target self-healing action, and to repair the fault of the network slice using the target self-healing action.

Optionally, the action generation module 20 is further configured to:

generate the slice self-healing action and the slice fault action from the target index data using the action generator of the target self-healing model.

Optionally, the dual-agent adversarial module 30 is further configured to:

feed the behavior score of the slice self-healing action back to the self-healing action generator and the behavior score of the slice fault action back to the fault action generator, then return to and execute the step of generating a slice self-healing action with the self-healing action generator of the target self-healing model and generating a slice fault action with the fault action generator of the target self-healing model, so as to optimize and update the slice self-healing action and the slice fault action until the behavior score of the slice self-healing action and the behavior score of the slice fault action satisfy a preset condition, thereby obtaining the target self-healing action;

switch the network slice from the current full-information state to the target full-information state corresponding to the target self-healing action, so as to repair the fault of the network slice.
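The feedback loop performed by module 30 can be sketched as plain control flow. Everything below is a hypothetical stand-in for the trained generators and evaluators: the function names, the scalar toy actions, the stopping rule and the `max_rounds` cap are assumptions, and the fault damage radius that the fault evaluator also receives is omitted for brevity.

```python
def adversarial_refinement(heal_actor, fault_actor, heal_critic, fault_critic,
                           state, is_done, max_rounds=50):
    """Alternate generation and scoring until the preset condition holds."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    q1 = q2 = None  # no feedback exists before the first round
    for round_no in range(1, max_rounds + 1):
        a1 = heal_actor(state, q1)    # slice self-healing action
        a2 = fault_actor(state, q2)   # slice fault action
        q1 = heal_critic(state, a1, a2)
        q2 = fault_critic(state, a1, a2)
        if is_done(q1, q2):
            break
    return a1, q1, q2, round_no


# Toy stand-ins: actions and scores are single numbers.
def toy_heal_actor(state, q1):
    return 0.0 if q1 is None else min(1.0, q1 + 0.25)


def toy_fault_actor(state, q2):
    return 0.5


def toy_heal_critic(state, a1, a2):
    return a1


def toy_fault_critic(state, a1, a2):
    return a2 - a1


target_action, heal_score, fault_score, rounds = adversarial_refinement(
    toy_heal_actor, toy_fault_actor, toy_heal_critic, toy_fault_critic,
    state=None, is_done=lambda q1, q2: q1 >= 0.75,
)
print(target_action, heal_score, fault_score, rounds)
```

In the device itself, the returned action would then drive the switch to the target full-information state.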

Optionally, the network slice fault self-healing device further includes a self-healing policy update module, configured to:

Optionally, the network slice fault self-healing device further includes a model training module, configured to:

In addition, an embodiment of the present invention further provides a computer program product comprising a computer program which, when executed by a processor, implements the operations of the network slice fault self-healing method provided in the above embodiments.

For the embodiments of the equipment and the computer program product of the present invention, reference may be made to the embodiments of the network slice fault self-healing method of the present invention, which are not repeated here.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity, operation or object from another, and do not necessarily require or imply any actual relationship or order between those entities, operations or objects. The terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or system that includes that element.

Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the description of the method embodiments may be consulted. The device embodiments described above are merely illustrative, and units described as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present invention. A person of ordinary skill in the art can understand and implement it without creative effort.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, a person skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the network slice fault self-healing method described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims

1. The network slice fault self-healing method is characterized by comprising the following steps of:

acquiring multi-dimensional index data of the network slice, and detecting the acquired multi-dimensional index data;

when abnormal data are detected to exist in the multi-dimensional index data of the network slice, generating a slice self-healing action and a slice fault action according to the multi-dimensional index data, and judging the generated slice self-healing action and slice fault action to determine the behavior scores of the slice self-healing action and the slice fault action;

the step of generating a slice self-healing action and a slice fault action according to the multidimensional index data comprises the following steps:

inputting the multi-dimensional index data into a preset target self-healing model, wherein the target self-healing model is obtained by performing iterative training on the preset self-healing model to be trained by utilizing the historical multi-dimensional index data of a network slice, and the target self-healing model comprises an action generator;

generating a slice self-healing action and a slice fault action by using an action generator of the target self-healing model according to the multi-dimensional index data;

and performing iterative optimization on the generated slice self-healing action and slice fault action according to the behavior score to determine a target self-healing action, and performing fault repair on the network slice by using the target self-healing action.

2. The network slice fault self-healing method of claim 1, wherein the action generator comprises a self-healing action generator and a fault action generator, the step of generating slice self-healing actions and slice fault actions using the action generator of the target self-healing model according to the multi-dimensional index data comprising:

determining a current full information state of the network slice according to the multi-dimensional index data, wherein the full information state comprises a first state observable by the self-healing action generator and a second state observable by the fault action generator;

inputting the first state of the full information state into the self-healing action generator, and generating a slice self-healing action by using the self-healing action generator;

and inputting the second state of the full information state into the fault action generator, and generating a slice fault action by using the fault action generator.

3. The network slice fault self-healing method of claim 2, wherein the target self-healing model further comprises a self-healing action evaluator and a fault action evaluator, the step of evaluating the generated slice self-healing action and slice fault action to determine a behavioral score of the slice self-healing action and the slice fault action comprising:

inputting the first state of the full information state, the slice self-healing action and the slice fault action into the self-healing action evaluator in the target self-healing model so as to judge the slice self-healing action and determine a behavior score of the slice self-healing action;

determining a fault destruction radius according to the slice fault action;

and inputting the second state of the full information state, the fault destruction radius, the slice self-healing action and the slice fault action into the fault action evaluator in the target self-healing model so as to judge the slice fault action and determine a behavior score of the slice fault action.

4. The network slice fault self-healing method of claim 3, wherein the step of iteratively optimizing the generated slice self-healing actions and slice fault actions according to the behavioral scores to determine a target self-healing action and utilizing the target self-healing action to perform fault remediation on the network slice comprises:

feeding back the behavior score of the slice self-healing action to the self-healing action generator, feeding back the behavior score of the slice fault action to the fault action generator, returning to and executing the step of generating the slice self-healing action by using the self-healing action generator in the target self-healing model and generating the slice fault action by using the fault action generator in the target self-healing model, so as to perform iterative optimization on the slice self-healing action and the slice fault action until the behavior score of the slice self-healing action and the behavior score of the slice fault action meet preset conditions, and obtaining the target self-healing action;

and switching the state of the network slice according to the target self-healing action so as to switch the network slice from the current full information state to the target full information state corresponding to the target self-healing action, and repairing the fault of the network slice.

5. The network slice fault self-healing method of claim 4, wherein the step of iteratively optimizing the generated slice self-healing actions and slice fault actions according to the behavioral scores to determine a target self-healing action and utilizing the target self-healing action to perform fault remediation on the network slice comprises:

inputting the target self-healing action into a preset rewarding function to obtain a rewarding value corresponding to the target self-healing action;

and generating an experience playback data set according to the reward value, and updating model parameters of the target self-healing model according to the experience playback data set.

6. The network slice fault self-healing method of claim 1, wherein prior to the step of generating slice self-healing actions and slice fault actions from the multi-dimensional index data, further comprising:

acquiring historical index data of a network slice and preprocessing the historical index data to obtain sample data;

obtaining model architecture parameters, and establishing a basic self-healing model according to the model architecture parameters;

and carrying out iterative training on the basic self-healing model by using the sample data to obtain a target self-healing model.

7. A network slice fault self-healing device, characterized in that the network slice fault self-healing device comprises:

the data detection module is used for acquiring the multi-dimensional index data of the network slice and detecting the acquired multi-dimensional index data;

the action generating module is used for generating a slice self-healing action and a slice fault action according to the multi-dimensional index data when abnormal data exist in the multi-dimensional index data of the network slice, and judging the generated slice self-healing action and slice fault action to determine the behavior scores of the slice self-healing action and the slice fault action; inputting the multi-dimensional index data into a preset target self-healing model, wherein the target self-healing model is obtained by performing iterative training on the preset self-healing model to be trained by utilizing the historical multi-dimensional index data of a network slice, and the target self-healing model comprises an action generator; generating a slice self-healing action and a slice fault action by using an action generator of the target self-healing model according to the multi-dimensional index data;

and the dual-agent adversarial module is used for carrying out iterative optimization on the generated slice self-healing action and slice fault action according to the behavior scores so as to determine a target self-healing action, and carrying out fault repair on the network slice by utilizing the target self-healing action.

8. A network slice fault self-healing device, the network slice fault self-healing device comprising: memory, a processor and a network slice fault self-healing program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the network slice fault self-healing method according to any one of claims 1 to 6.

9. A storage medium having stored thereon a computer program which, when executed by a processor, implements the network slice fault self-healing method according to any one of claims 1-6.