WO2022057321A1

WO2022057321A1 - Method and apparatus for detecting anomalous link, and storage medium

Info

Publication number: WO2022057321A1
Application number: PCT/CN2021/098011
Authority: WO
Inventors: 苏婵菲; 文勇; 刘宝华; 潘璐伽
Original assignee: 华为技术有限公司
Priority date: 2020-09-17
Filing date: 2021-06-02
Publication date: 2022-03-24
Also published as: CN114205245A

Abstract

Provided is a method for detecting an anomalous link, the method comprising: receiving network data of at least one network node in a communication link; acquiring a network feature corresponding to the network data; and inputting the network feature into a first model to obtain a detection result indicating whether the communication link is an anomalous link. The first model is obtained by training a second model, itself obtained in previous training, according to the labeled samples in a first sample set, K labeled samples and M unlabeled samples, until the training meets a preset condition. The K labeled samples are obtained by respectively labeling K unlabeled samples in the first sample set; the M unlabeled samples are unlabeled samples selected from the first sample set to serve as negative samples; and the first sample set comprises the labeled samples and unlabeled samples that were pre-stored before the K unlabeled samples and the M unlabeled samples were selected. The embodiments of the present application increase the accuracy of anomalous link detection.

Description

Abnormal link detection method, device and storage medium

This application claims priority to Chinese patent application No. 202010981945.1, entitled "Abnormal link detection method, device and storage medium", filed with the State Intellectual Property Office of China on September 17, 2020, the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the technical field of artificial intelligence, and in particular, to an abnormal link detection method, device and storage medium.

Background

With the rapid development of telecommunication networks and the increasingly diverse demands of users, network communication enterprises need to deal with large-scale communication data and more complex network operation and maintenance work. If the abnormality that occurs on the network device cannot be detected and handled in time, the user cannot communicate normally, thereby affecting the user experience.

In practical applications, an abnormal link detection model can be used to detect a communication link to obtain a detection result of whether the communication link is an abnormal link. However, the abnormal link detection model is usually a classifier trained on a large number of labeled samples, each of which is labeled manually; this requires a lot of manpower, and manual labeling introduces some errors. How to improve the detection accuracy of abnormal links by using the existing labeled samples is a technical problem to be solved by those skilled in the art.

Summary of the Invention

The embodiments of the present application disclose an abnormal link detection method, device and storage medium that detect whether a communication link is abnormal by using an abnormal link detection model trained on samples selected from unlabeled samples together with existing labeled samples, which improves the accuracy of detecting abnormal links.

In a first aspect, an embodiment of the present application discloses a method for detecting an abnormal link, including: receiving network data of at least one network node in a communication link; acquiring network features corresponding to the network data; and inputting the network features into a first model to obtain a detection result of the communication link. The detection result indicates whether the communication link is an abnormal link. The first model is the model obtained when a second model, obtained from the previous training, is trained according to the labeled samples in a first sample set, K labeled samples and M unlabeled samples until the training meets a preset condition. The K labeled samples are obtained by respectively labeling K unlabeled samples in the first sample set, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the labeled samples and unlabeled samples that were pre-stored before the K unlabeled samples and the M unlabeled samples were selected. In this way, after the network data of the network nodes in the communication link is received, the network features of the network data are obtained and input into an abnormal link detection model trained on samples selected from the unlabeled samples and on the existing labeled samples, which improves the accuracy of detecting abnormal links.
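The detection flow of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: `extract_features`, `detect_link`, the feature keys and the 0.5 threshold are hypothetical, and the first model is represented by any callable that returns an abnormality score.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical feature names; the application does not fix a feature layout.
FEATURE_KEYS = ("snr", "rsl", "es", "ses", "uas")


def extract_features(network_data: Dict[str, float],
                     keys: Sequence[str] = FEATURE_KEYS) -> List[float]:
    """Turn the received network data into a fixed-order feature vector."""
    return [float(network_data[key]) for key in keys]


def detect_link(network_data: Dict[str, float],
                first_model: Callable[[List[float]], float],
                threshold: float = 0.5) -> bool:
    """Return True if the first model scores the link as abnormal."""
    features = extract_features(network_data)
    return first_model(features) >= threshold


def toy_model(features: List[float]) -> float:
    """Stand-in for a trained first model: flags links with many errored seconds."""
    errored_seconds = features[2]
    return 1.0 if errored_seconds > 10 else 0.0


sample = {"snr": 18.0, "rsl": -62.0, "es": 25, "ses": 3, "uas": 0}
is_abnormal = detect_link(sample, toy_model)
```

Any trained classifier that maps a feature vector to a score could be passed in place of `toy_model`.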

In a possible example, before inputting the network features into the first model, the method further includes: acquiring an anomaly score of each unlabeled sample in the first sample set; sorting the unlabeled samples in descending order of anomaly score to obtain a first ranking; and taking the unlabeled samples corresponding to the first K positions in the first ranking as the K unlabeled samples. That is, the samples selected for labeling are the K most abnormal unlabeled samples in the first sample set, so the sample set used to train the abnormal link detection model includes samples that are likely to be abnormal, which can improve the effect of model training and thus the accuracy of detecting abnormal links.
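The selection of the K samples to be labeled can be sketched as below. The function name `select_top_k`, the sample identifiers and the score values are illustrative assumptions; how the anomaly scores are computed is outside this sketch.

```python
from typing import Dict, List


def select_top_k(anomaly_scores: Dict[str, float], k: int) -> List[str]:
    """Return the IDs of the k unlabeled samples with the highest anomaly scores."""
    # Descending sort by score gives the "first ranking".
    first_ranking = sorted(anomaly_scores,
                           key=lambda sample_id: anomaly_scores[sample_id],
                           reverse=True)
    # The first k positions are the samples handed over for labeling.
    return first_ranking[:k]


scores = {"link_a": 0.91, "link_b": 0.12, "link_c": 0.77, "link_d": 0.05}
to_label = select_top_k(scores, k=2)
```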

In a possible example, the method further includes: taking the unlabeled samples corresponding to the last L positions in the first ranking as L unlabeled samples; and selecting the M unlabeled samples from the L unlabeled samples. That is, the M unlabeled samples are randomly selected from the L most normal unlabeled samples in the first sample set and are treated as negative samples. The samples newly added to the training set of the abnormal link detection model are therefore samples of normal links, which avoids introducing noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.
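A sketch of drawing the M negative samples from the tail of the ranking follows. The helper name `select_negatives`, the optional `seed` parameter and the sample data are assumptions added for illustration and reproducibility.

```python
import random
from typing import Dict, List, Optional


def select_negatives(anomaly_scores: Dict[str, float], l: int, m: int,
                     seed: Optional[int] = None) -> List[str]:
    """Randomly pick m sample IDs from the l lowest-scoring unlabeled samples."""
    first_ranking = sorted(anomaly_scores,
                           key=lambda sample_id: anomaly_scores[sample_id],
                           reverse=True)
    # The last l positions are the "most normal" candidates.
    tail = first_ranking[-l:] if l > 0 else []
    rng = random.Random(seed)
    # Sample without replacement; never ask for more than the tail holds.
    return rng.sample(tail, min(m, len(tail)))


scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.2, "e": 0.1}
negatives = select_negatives(scores, l=3, m=2, seed=0)
```

The selected samples are used as negative samples without being manually labeled, so choosing them from the low-score tail is what keeps the risk of mislabeled positives small.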

In a possible example, before the network features are input into the first model, the method further includes: acquiring network topology information of the communication link; and taking the set composed of pre-stored unlabeled samples and labeled samples corresponding to the network topology information as the first sample set. That is, selecting unlabeled and labeled samples corresponding to the network topology information of the communication link as the candidate training samples can improve the effect of model training and helps improve the accuracy of detecting whether the communication link is an abnormal link.
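One way to picture the topology-based construction of the first sample set is a simple filter over stored samples. The dictionary layout and the `topology_id` key are hypothetical; the application does not specify how samples are matched to topology information.

```python
from typing import Any, Dict, List


def build_first_sample_set(stored_samples: List[Dict[str, Any]],
                           topology_id: str) -> List[Dict[str, Any]]:
    """Keep only the stored samples recorded for the given topology."""
    return [s for s in stored_samples if s["topology_id"] == topology_id]


stored = [
    {"topology_id": "ring", "features": [0.2, 0.1], "label": 0},
    {"topology_id": "ring", "features": [0.9, 0.7], "label": None},  # unlabeled
    {"topology_id": "star", "features": [0.4, 0.3], "label": 1},
]
first_sample_set = build_first_sample_set(stored, "ring")
```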

In a second aspect, an embodiment of the present application discloses a model training method, comprising: selecting K unlabeled samples from a first sample set; selecting M unlabeled samples from the first sample set as negative samples, where the first sample set includes the labeled samples and unlabeled samples that were pre-stored before the K unlabeled samples and the M unlabeled samples were selected; and training, according to the labeled samples in the first sample set, K labeled samples and the M unlabeled samples, a second model obtained from the previous training, to obtain a first model when the training meets a preset condition, where the K labeled samples are obtained by respectively labeling the K unlabeled samples. In this way, by selecting samples from the unlabeled samples, the model can learn the distribution of positive and negative samples among the unlabeled samples during training. In addition, retraining the model obtained from the previous training according to the selected samples and the existing labeled samples can further improve the detection accuracy.
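The retraining step can be illustrated with a deliberately small stand-in. The sketch below assumes a logistic-regression classifier trained by gradient descent, with "loss change below a tolerance or a maximum number of epochs" as the preset condition; the application does not prescribe this model, optimizer or stopping rule, and all names and values are hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def retrain(weights: List[float], bias: float,
            X: List[List[float]], y: List[int],
            lr: float = 0.5, max_epochs: int = 500,
            tol: float = 1e-6) -> Tuple[List[float], float]:
    """Continue training the 'second model' parameters; return the 'first model'."""
    w, b = list(weights), bias
    prev_loss = float("inf")
    n = len(X)
    for _ in range(max_epochs):
        grad_w, grad_b, loss = [0.0] * len(w), 0.0, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
            for j, xj in enumerate(xi):
                grad_w[j] += (p - yi) * xj
            grad_b += p - yi
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
        loss /= n
        if abs(prev_loss - loss) < tol:  # assumed preset condition
            break
        prev_loss = loss
    return w, b


# Existing labeled samples, K newly labeled samples, M unlabeled negatives.
labeled_X, labeled_y = [[0.9], [0.1]], [1, 0]
k_X, k_y = [[0.8]], [1]
m_X = [[0.2]]  # treated as negative samples without manual labels
X = labeled_X + k_X + m_X
y = labeled_y + k_y + [0] * len(m_X)

second_model = ([0.0], 0.0)  # parameters from the previous training round
first_w, first_b = retrain(second_model[0], second_model[1], X, y)
```

Starting from the previous round's parameters rather than from scratch is what distinguishes this step from constructing the initialization model.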

In a possible example, selecting K unlabeled samples from the first sample set includes: acquiring an anomaly score of each unlabeled sample in the first sample set; sorting the unlabeled samples in descending order of anomaly score to obtain a first ranking; and taking the unlabeled samples corresponding to the first K positions in the first ranking as the K unlabeled samples. That is, the samples selected for labeling are the K most abnormal unlabeled samples in the first sample set, so the sample set used to train the abnormal link detection model includes samples that are likely to be abnormal, which can improve the effect of model training and thus the accuracy of detecting abnormal links.

In a possible example, selecting M unlabeled samples as negative samples from the first sample set includes: taking the unlabeled samples corresponding to the last L positions in the first ranking as L unlabeled samples; and selecting the M unlabeled samples from the L unlabeled samples. That is, the M unlabeled samples are randomly selected from the L most normal unlabeled samples in the first sample set and are treated as negative samples. The samples newly added to the training set of the abnormal link detection model are therefore samples of normal links, which avoids introducing noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.

In a possible example, selecting M unlabeled samples as negative samples from the first sample set includes: counting the number of positive samples among the labeled samples in the first sample set and the K labeled samples; and selecting, according to the number of positive samples, M unlabeled samples from the first sample set as negative samples, where M is equal to the number of positive samples. That is, the number of negative samples newly added to the training set of the abnormal link detection model is equal to the number of positive samples in the training set, which brings the positive and negative samples closer to balance, reduces label noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.

In a possible example, before selecting K unlabeled samples from the first sample set, the method further includes: acquiring network topology information of the communication link to be detected; and taking the set composed of pre-stored unlabeled samples and labeled samples corresponding to the network topology information as the first sample set. That is, selecting unlabeled and labeled samples corresponding to the network topology information of the communication link as the candidate training samples can improve the effect of model training and helps improve the accuracy of detecting whether the communication link is an abnormal link.

With reference to the first aspect, the second aspect or any possible example, in a possible example, before acquiring the anomaly score of each unlabeled sample in the first sample set, the method further includes: acquiring an anomaly score of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples that were pre-stored before P unlabeled samples were selected; sorting the unlabeled samples in the second sample set in descending order of anomaly score to obtain a second ranking; taking the unlabeled samples corresponding to the first P positions in the second ranking as the P unlabeled samples; and constructing a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by respectively labeling the P unlabeled samples, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the P most abnormal unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links.

In a third aspect, an embodiment of the present application discloses an abnormal link detection device, comprising: a communication unit, configured to receive network data of at least one network node in a communication link; and a processing unit, configured to acquire network features corresponding to the network data and to input the network features into a first model to obtain a detection result of the communication link, where the detection result indicates whether the communication link is an abnormal link. The first model is the model obtained when a second model, obtained from the previous training, is trained according to the labeled samples in a first sample set, K labeled samples and M unlabeled samples until the training meets a preset condition. The K labeled samples are obtained by respectively labeling K unlabeled samples in the first sample set, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the labeled samples and unlabeled samples that were pre-stored before the K unlabeled samples and the M unlabeled samples were selected. In this way, after the network data of the network nodes in the communication link is received, the network features of the network data are obtained and input into an abnormal link detection model trained on samples selected from the unlabeled samples and on the existing labeled samples, which improves the accuracy of detecting abnormal links.

In a possible example, the processing unit is further configured to: acquire an anomaly score of each unlabeled sample in the first sample set; sort the unlabeled samples in descending order of anomaly score to obtain a first ranking; and take the unlabeled samples corresponding to the first K positions in the first ranking as the K unlabeled samples. That is, the samples selected for labeling are the K most abnormal unlabeled samples in the first sample set, so the sample set used to train the abnormal link detection model includes samples that are likely to be abnormal, which can improve the effect of model training and thus the accuracy of detecting abnormal links.

In a possible example, the processing unit is further configured to: take the unlabeled samples corresponding to the last L positions in the first ranking as L unlabeled samples; and select the M unlabeled samples from the L unlabeled samples. That is, the M unlabeled samples are randomly selected from the L most normal unlabeled samples in the first sample set and are treated as negative samples. The samples newly added to the training set of the abnormal link detection model are therefore samples of normal links, which avoids introducing noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.

In a possible example, the processing unit is further configured to: acquire an anomaly score of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples that were pre-stored before P unlabeled samples were selected; sort the unlabeled samples in the second sample set in descending order of anomaly score to obtain a second ranking; take the unlabeled samples corresponding to the first P positions in the second ranking as the P unlabeled samples; and construct a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by respectively labeling the P unlabeled samples, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the P most abnormal unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links.

In a possible example, the processing unit is further configured to: acquire network topology information of the communication link; and take the set composed of pre-stored unlabeled samples and labeled samples corresponding to the network topology information and device information as the first sample set. That is, selecting unlabeled and labeled samples corresponding to the network topology information of the communication link as the candidate training samples can improve the effect of model training and helps improve the accuracy of detecting whether the communication link is an abnormal link.

In a fourth aspect, an embodiment of the present application discloses a model training device, comprising: a selection module, configured to select K unlabeled samples from a first sample set and to select M unlabeled samples from the first sample set as negative samples, where the first sample set includes the labeled samples and unlabeled samples that were pre-stored before the K unlabeled samples and the M unlabeled samples were selected; and a training module, configured to train, according to the labeled samples in the first sample set, K labeled samples and the M unlabeled samples, a second model obtained from the previous training, and to obtain a first model when the training meets a preset condition, where the K labeled samples are obtained by respectively labeling the K unlabeled samples in the first sample set. In this way, by selecting samples from the unlabeled samples, the model can learn the distribution of positive and negative samples among the unlabeled samples during training. In addition, retraining the model obtained from the previous training according to the selected samples and the existing labeled samples can further improve the detection accuracy.

In a possible example, the selection module is specifically configured to: acquire an anomaly score of each unlabeled sample in the first sample set; sort the unlabeled samples in descending order of anomaly score to obtain a first ranking; and take the unlabeled samples corresponding to the first K positions in the first ranking as the K unlabeled samples. That is, the samples selected for labeling are the K most abnormal unlabeled samples in the first sample set, so the sample set used to train the abnormal link detection model includes samples that are likely to be abnormal, which can improve the effect of model training and thus the accuracy of detecting abnormal links.

In a possible example, the selection module is specifically configured to: take the unlabeled samples corresponding to the last L positions in the first ranking as L unlabeled samples; and select the M unlabeled samples from the L unlabeled samples. That is, the M unlabeled samples are randomly selected from the L most normal unlabeled samples in the first sample set and are treated as negative samples. The samples newly added to the training set of the abnormal link detection model are therefore samples of normal links, which avoids introducing noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.

In a possible example, the selection module is specifically configured to: count the number of positive samples among the labeled samples in the first sample set and the K labeled samples; and select, according to the number of positive samples, M unlabeled samples from the first sample set as negative samples, where M is equal to that number of positive samples. That is, the number of negative samples newly added to the training set of the abnormal link detection model is equal to the number of positive samples in the training set, which brings the positive and negative samples closer to balance, reduces label noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.

In a possible example, the selection module is further configured to: acquire an anomaly score of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples that were pre-stored before P unlabeled samples were selected; sort the unlabeled samples in the second sample set in descending order of anomaly score to obtain a second ranking; take the unlabeled samples corresponding to the first P positions in the second ranking as the P unlabeled samples; and construct a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by respectively labeling the P unlabeled samples, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the P most abnormal unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links.

In a possible example, the selection module is further configured to: acquire network topology information of the communication link to be detected; and take the set composed of pre-stored unlabeled samples and labeled samples corresponding to the network topology information as the first sample set. That is, selecting unlabeled and labeled samples corresponding to the network topology information of the communication link as the candidate training samples can improve the effect of model training and helps improve the accuracy of detecting whether the communication link is an abnormal link.

With reference to the first aspect, the third aspect or any possible example, in a possible example, M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples. In this way, the number of negative samples newly added to the training set of the abnormal link detection model is equal to the number of positive samples in the training set, which brings the positive and negative samples closer to balance, reduces label noise, improves the effect of model training and helps improve the accuracy of detecting abnormal links.
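The balancing rule for M can be sketched as a simple count. The label encoding (1 for an abnormal link, 0 for a normal link) and the example label lists are assumptions for illustration.

```python
from typing import List

POSITIVE = 1  # abnormal link (assumed encoding)
NEGATIVE = 0  # normal link (assumed encoding)


def count_positives(existing_labels: List[int], new_labels: List[int]) -> int:
    """Count positive samples among the existing and the K newly labeled samples."""
    return sum(1 for label in existing_labels + new_labels if label == POSITIVE)


existing_labels = [1, 0, 0, 1]   # labels already in the first sample set
new_labels = [1, 1, 0]           # labels given to the K selected samples
m = count_positives(existing_labels, new_labels)
```

With these example labels, four unlabeled samples would be drawn as negatives, one for each positive sample in the training set.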

With reference to the first aspect, the second aspect, the third aspect, the fourth aspect or any possible example, in a possible example, the network data includes at least one of the following: a signal-to-noise ratio, an input signal level, errored seconds, severely errored seconds, unavailable time, and network topology information. In this way, abnormal link detection can be performed on different kinds of network data, which improves the diversity of detection.

In a fifth aspect, an embodiment of the present application provides another device, comprising a processor, a memory connected to the processor, and a communication interface, where the memory stores one or more programs configured to be executed by the processor to perform the steps of any of the foregoing aspects. The device includes an abnormal link detection device and a model training device.

In a sixth aspect, the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, which, when executed on a computer, cause the computer to execute the method of any one of the foregoing aspects.

In a seventh aspect, the present application provides a computer program product, where the computer program product is used to store a computer program, and when the computer program is run on a computer, causes the computer to execute the method of any one of the above-mentioned aspects.

In an eighth aspect, the present application provides a chip, including a processor and a memory, where the processor is configured to call and execute instructions stored in the memory from the memory, so that a device equipped with the chip executes the method of any one of the foregoing aspects.

In a ninth aspect, the present application provides another chip, comprising: an input interface, an output interface and a processing circuit, the input interface, the output interface and the processing circuit are connected through an internal connection path, and the processing circuit is used to perform any one of the above-mentioned aspects. method.

In a tenth aspect, the present application provides another chip, including: an input interface, an output interface, a processor, and optionally a memory, the input interface, the output interface, the processor, and the memory are connected through an internal connection path, The processor is used to execute code in the memory, and when the code is executed, the processor is used to perform the method of any of the above aspects.

In an eleventh aspect, an embodiment of the present application provides a chip system, including at least one processor, a memory and an interface circuit, where the memory, the transceiver and the at least one processor are interconnected through lines, and the memory stores a computer program; when the computer program is executed by the processor, the method of any of the above aspects is performed.

Description of the Drawings

FIG. 1 is a schematic flowchart of a model training method provided by an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a communication system provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a detection node provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of an abnormal link detection method provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a model training device provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an abnormal link detection apparatus provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a device provided by an embodiment of the present application.

Detailed Description

Before the embodiments of the present application are described, some concepts used in the following description are first explained.

(1) Normal link and abnormal link.

No abnormal situation occurs during communication over a normal link. An abnormal link is the opposite: an abnormal situation occurs during communication over it. Abnormal situations include at least one of the following, which is not limited herein: a network node in the communication link is disconnected from other network nodes; a network node does not receive a signal (or information) that it was expected to receive; a network node does not send a signal to be transmitted to the network node that should receive it; or a network node sends a signal to be transmitted to a network node that should not receive it.

(2) Binary classification, positive samples and negative samples.

Binary classification means that there are two categories in a classification task, for example, classifying a picture to determine whether the picture is a car or not. Positive samples include categories that need to be identified in binary classification tasks. Negative samples are the opposite of positive samples, which include categories that do not need to be identified in binary classification tasks. For example, to classify a picture to determine whether the image in the picture belongs to a car, the car is the type to be identified, the picture of the car can be used as a positive sample, and any picture that is not a car can be used as a negative sample. In the abnormal link detection scenario, the category of the abnormal link needs to be identified. Therefore, in this embodiment of the present application, positive samples correspond to samples of abnormal links, and negative samples correspond to samples of normal links.

(3) Outliers and non-outliers.

An outlier is a sample that is significantly different from the rest of the data. A non-outlier is the opposite: a sample that is consistent with the rest of the data. Since the quantity of normal data is much larger than the quantity of abnormal data, outliers can be understood as abnormal data, and non-outliers can be understood as normal data. That is to say, in the embodiments of the present application, outliers may be understood as positive samples, and non-outliers may be understood as negative samples.

(4) True positive (TP) samples, false negative (FN) samples, false positive (FP) samples, and true negative (TN) samples.

A true positive sample is actually a positive sample, and the binary classification model predicts it as a positive sample. A false negative sample is actually a positive sample, but the binary classification model predicts it as a negative sample. A false positive sample is actually a negative sample, but the binary classification model predicts it as a positive sample. A true negative sample is actually a negative sample, and the binary classification model predicts it as a negative sample.
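The four outcomes can be counted as in the following sketch, which assumes labels are encoded as 1 (positive, abnormal link) and 0 (negative, normal link); the function name and example data are illustrative.

```python
from collections import Counter
from typing import List


def confusion_counts(actual: List[int], predicted: List[int]) -> Counter:
    """Count TP, FN, FP and TN for binary labels (1 = positive, 0 = negative)."""
    counts = Counter({"TP": 0, "FN": 0, "FP": 0, "TN": 0})
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1   # positive, predicted positive
        elif a == 1 and p == 0:
            counts["FN"] += 1   # positive, predicted negative
        elif a == 0 and p == 1:
            counts["FP"] += 1   # negative, predicted positive
        else:
            counts["TN"] += 1   # negative, predicted negative
    return counts


actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1]
counts = confusion_counts(actual, predicted)
```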

(5) An abnormal link detection model, a first model and a second model.

In this embodiment of the present application, the abnormal link detection model is used to detect whether the communication link is an abnormal link. It should be noted that this application refers to the initialization model of the abnormal link detection model as the third model and to the abnormal link detection model obtained from the previous training as the second model; the second model is then trained, and the abnormal link detection model obtained when that training is completed is called the first model. When the second model is not the third model, the training methods of the first model and the second model are the same. The initialization model refers to the model obtained when the abnormal link detection model is constructed, which can be understood as the model obtained by the first training, and the third model can also be understood as the initialization model of the first model and the second model. The parameters of the third model can be understood as the initialization parameters of the abnormal link detection model, and constructing the initialization model can be understood as obtaining the initialization parameters of the abnormal link detection model. The parameters of the second model can be understood as the initialization parameters of the first model, and training the second model can be understood as updating the parameters of the second model, or equivalently as acquiring the initialization parameters of the first model.

The abnormal link detection model can be a neural network. A neural network is composed of neural units. A neural unit can refer to an operation unit that takes x_s and an intercept 1 as inputs. The output of the operation unit can be:

$$h_{W,b}(x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right) \tag{1}$$

Here, s = 1, 2, ..., n, where n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces nonlinearity into the neural network and converts the input signal of the neural unit into an output signal. The output signal of the activation function can be used as the input of the next convolutional layer. The activation function can be a sigmoid function. A neural network is formed by connecting many such neural units, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected to a local receptive field of the previous layer to extract the features of that local receptive field, and the local receptive field can be an area composed of several neural units. If the abnormal link detection model is a neural network, the parameters of the abnormal link detection model can be understood as W_s and b in formula (1).
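As an illustration of a single neural unit, the following sketch computes the weighted sum of the inputs plus the bias and passes it through a sigmoid activation. The function names and the numeric values are assumptions chosen for the example, not values from the application.

```python
import math


def sigmoid(z):
    """Sigmoid activation function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(x, w, b):
    """Output of one neural unit: f(sum over s of W_s * x_s, plus b)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(ws * xs for ws, xs in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical inputs, weights and bias; the weighted sum is (almost exactly) 0.
output = neural_unit([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], b=-0.1)
```

Because the weighted sum in this example is approximately 0, the sigmoid output is approximately 0.5.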

(6) Network data.

In this embodiment of the present application, the network data includes, without limitation, device information and performance data of a network node and network topology information of the communication link corresponding to the network node. The network topology information of the communication link describes the connection relationships between the network nodes in the communication link and the device information of each network node. The device information describes hardware parameters of the network node, such as device model, voltage limit, current limit, storage capacity and transmission rate. The performance data of the network node may include, but is not limited to, at least one of the following: signal-to-noise ratio (SNR or S/N), input signal level (signal level at receiver input, RSL), errored seconds (ES), severely errored seconds (SES), unavailable time (unavailable seconds, UAS), skewness, and so on.

The signal-to-noise ratio refers to the ratio of signal to noise in a network node or a communication link. The noise here refers to an irregular additional signal (or information) that is generated by the network node and does not exist in the original signal; this additional signal does not change as the original signal changes. In this embodiment of the present application, SNR_max represents the maximum signal-to-noise ratio during the observation period, and SNR_min represents the minimum signal-to-noise ratio during the observation period. The observation period can also be called the observation time or observation duration. It can be a preset fixed period that is the same for all communication links and all network nodes; it can be a period set separately for each communication link and each network node; or it can be a dynamic period determined by the gateway node that manages the network nodes in the communication link, that is, the period is not a fixed value but is determined by the gateway node according to channel quality, network load and similar conditions. This is not limited here.

The input signal level refers to the logarithm of the ratio of power, voltage or current between two network nodes when one network node sends a signal to the other. In this embodiment of the present application, RSL_max represents the maximum input signal level within the observation period.

Errored seconds (ES) are used to describe the number of errors within a second. A severely errored second (SES) is a second of the observation period in which the bit error rate is greater than a threshold or a loss of signal is detected. In this embodiment of the present application, ES_max represents the maximum errored seconds during the observation period, and SES_max represents the maximum severely errored seconds during the observation period.

The unavailable time starts, and is reported, when the network node generates 10 consecutive severely errored seconds, and ends when none of 10 consecutive seconds is a severely errored second. In this embodiment of the present application, UAS_max represents the maximum unavailable time during the observation period.

Skewness, also known as the skewness coefficient, is a measure of the direction and degree of skew in the distribution of statistical data, and is used to measure the asymmetry of the probability distribution of a random variable. A skewness of 0 indicates a perfectly symmetric distribution; the skewness of the normal distribution is 0. Skewness can be calculated with formula (2):

$$S = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^{3} \tag{2}$$

Here, S is the skewness, x_i is the i-th value, n is the number of samples, μ is the mean, and σ is the standard deviation.
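A minimal sketch of the skewness calculation follows. It assumes the population form, in which S is the mean of the cubed standardized deviations ((x_i − μ)/σ)³ over the n samples; returning 0 for a constant series is a convention chosen here, and the sample values are illustrative.

```python
import math


def skewness(values):
    """Population skewness: mean of ((x_i - mu) / sigma) ** 3 over all samples."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        return 0.0  # assumption: a constant series is treated as symmetric
    return sum(((v - mu) / sigma) ** 3 for v in values) / n


symmetric = skewness([1.0, 2.0, 3.0, 4.0, 5.0])      # symmetric around the mean
right_tailed = skewness([1.0, 1.0, 1.0, 1.0, 10.0])  # one large outlier
```

A symmetric series yields a skewness of approximately 0, while a series with a long right tail yields a positive skewness.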

(7) Network features.

In this embodiment of the present application, the network feature describes the performance characteristics corresponding to the network data. The present application does not limit the method for acquiring network features; for example, statistical analysis can be performed on network data of different dimensions. When network data of a network node is received, the network features of the network node can be determined from that network data. Taking network node 1 as an example, the variance of the SNR, SNR_max, SNR_min, the skewness of the RSL, ES_max, SES_max and UAS_max of network node 1 during the observation period can be obtained.

When network data of multiple network nodes on a communication link is received, the network features of each network node can be obtained separately, or the network features of the communication link can be obtained by combining the network data of the network nodes. Take network node 1 and network node 2, both of which are network nodes on a communication link L1, as an example. For each of the two network nodes, the variance of the SNR, SNR_max, SNR_min, the skewness of the RSL, ES_max, SES_max, UAS_max and so on during the observation period can be obtained. Across the two network nodes, the following can be obtained: the SNR_max between network node 1 and network node 2, or the variance of SNR_max, or the sum of squared differences of SNR_max; the SNR_min between network node 1 and network node 2, or the variance of SNR_min, or the sum of squared differences of SNR_min; the RSL_max between network node 1 and network node 2, or the skewness of RSL_max; the ES_max between network node 1 and network node 2; the SES_max between network node 1 and network node 2; the UAS_max between network node 1 and network node 2; and so on. The network features of the communication link can then be described by a single piece of data, for example the network feature of communication link L1: x = {variance of the SNR of network node 1, skewness of the RSL of network node 1, variance of the SNR of network node 2, skewness of the RSL of network node 2, sum of squared differences of SNR_max between network node 1 and network node 2, sum of squared differences of SNR_min between network node 1 and network node 2, ES_max between network node 1 and network node 2, SES_max between network node 1 and network node 2, UAS_max between network node 1 and network node 2}.
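The construction of a link-level feature vector can be sketched as follows. Several choices here are assumptions made only for illustration: the "sum of squared differences" between two nodes is interpreted as the squared difference of their values, a quantity "between" two nodes such as ES_max is interpreted as the larger of the two per-node maxima, and the RSL skewness entries are omitted for brevity. The dictionary keys and sample readings are hypothetical.

```python
import statistics


def node_features(snr, es, ses, uas):
    """Per-node statistics over one observation period."""
    return {
        "snr_var": statistics.pvariance(snr),
        "snr_max": max(snr),
        "snr_min": min(snr),
        "es_max": max(es),
        "ses_max": max(ses),
        "uas_max": max(uas),
    }


def link_features(node1, node2):
    """Feature vector x of a link formed by two nodes (interpretation assumed)."""
    f1, f2 = node_features(**node1), node_features(**node2)
    return [
        f1["snr_var"],
        f2["snr_var"],
        (f1["snr_max"] - f2["snr_max"]) ** 2,  # squared difference of SNR_max
        (f1["snr_min"] - f2["snr_min"]) ** 2,  # squared difference of SNR_min
        max(f1["es_max"], f2["es_max"]),
        max(f1["ses_max"], f2["ses_max"]),
        max(f1["uas_max"], f2["uas_max"]),
    ]


# Hypothetical readings of two nodes on link L1.
node1 = {"snr": [30.0, 32.0, 34.0], "es": [0, 2, 1], "ses": [0, 1, 0], "uas": [0, 0, 0]}
node2 = {"snr": [20.0, 20.0, 26.0], "es": [5, 3, 4], "ses": [2, 0, 1], "uas": [0, 10, 0]}
x = link_features(node1, node2)
```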

Network features can also be obtained through a network embedding method. Network embedding aims to learn low-dimensional latent representations of the nodes in a network; the learned representations can be used as features for various graph-based tasks, such as classification, clustering, link prediction and visualization. The central idea is to find a mapping function that transforms each node in the network into a low-dimensional latent representation. Obtaining the network features of network nodes through network embedding can improve the accuracy and efficiency of feature acquisition.

(8) Labeled samples, unlabeled samples, first sample set and second sample set.

In this embodiment of the present application, the data of a labeled sample includes a label, which indicates whether the labeled sample is a positive sample or a negative sample. The data of an unlabeled sample does not include a label. Labeled samples may also be referred to as marked samples, labeled data or tagged data, and unlabeled samples may also be referred to as unmarked samples, unlabeled data or untagged data. The embodiments of the present application use labeled samples and unlabeled samples as examples for illustration: positive samples among the labeled samples correspond to abnormal links, and negative samples among the labeled samples correspond to normal links. In addition, both labeled samples and unlabeled samples can include network data of network nodes; for the network data, refer to definition (6) above, which is not repeated here.

In this embodiment of the present application, the first sample set includes the labeled samples and unlabeled samples stored before the samples for training the first model are selected. The second sample set includes the labeled samples and unlabeled samples stored before the samples for building the third model are selected. When the samples selected from the first sample set are K unlabeled samples and M unlabeled samples, the first sample set can be understood as the set of labeled samples and unlabeled samples before the K unlabeled samples and the M unlabeled samples are selected. When the samples selected from the second sample set are P unlabeled samples, the second sample set can be understood as the set of labeled samples and unlabeled samples before the P unlabeled samples are selected.

This application does not limit how the first sample set and the second sample set are chosen. All or part of the stored samples may be used; when part of the samples is used, they may be samples obtained in a recent period, historical samples of the communication link to be detected, or historical samples of a communication link of the same type as that communication link, which is not limited here. Taking the first sample set as an example, the network topology information of the labeled samples and unlabeled samples in the first sample set is consistent with the network topology information of the communication link to be detected. The communication link to be detected may be the communication link on which the abnormal link detection model is deployed. The samples in the first sample set may be historical samples of that communication link, or historical samples of a communication link of the same type. It can be understood that selecting unlabeled samples and labeled samples that correspond to the same network topology information as the first sample set can improve the effect of model training and the accuracy of communication link detection.

(9) Evaluation index of abnormal link detection model.

In the embodiment of the present application, the evaluation index of the abnormal link detection model is used to evaluate the detection effect of the abnormal link detection model. A better detection effect can be understood as a larger value of the evaluation index with which the abnormal link detection model identifies abnormal links. Evaluation indices can include precision (P), recall (R), sensitivity (true positive rate, TPR), false positive rate (FPR, referred to as specificity in this application), accuracy (acc), F1 value (F1-score), and so on, which are not limited here.

Precision refers to the proportion of samples predicted as positive that are actually positive. Recall refers to the proportion of all positive samples that are correctly classified as positive. Sensitivity refers to the proportion of all positive samples that are correctly identified as positive. Specificity, as used in this application, is expressed as the false positive rate (FPR), that is, the proportion of all negative samples that are misidentified as positive; in conventional usage, specificity is the true negative rate, which equals 1 − FPR. Accuracy refers to the proportion of all samples that are correctly classified. The F1 value is the harmonic mean of precision and recall. A larger recall means higher prediction coverage but usually comes with lower precision, so the F1 value can be used to balance precision and recall. For the calculation of the precision P, recall R, sensitivity TPR, specificity FPR, accuracy acc and F1 value, refer to formula (3), formula (4), formula (5), formula (6), formula (7) and formula (8), respectively.

P=TP/(TP+FP) (3)

R=TP/(TP+FN) (4)

TPR=TP/(TP+FN) (5)

FPR=FP/(FP+TN) (6)

acc=(TP+TN)/(TP+FN+FP+TN) (7)

F1=(2*P*R)/(P+R) (8)

In the above formulas, acc represents the accuracy, TP represents the number of true positive samples, FP represents the number of false positive samples, FN represents the number of false negative samples, and TN represents the number of true negative samples. Recall and sensitivity are equal, because the number of all positive samples equals the number of true positive samples plus the number of false negative samples.
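The evaluation indices can be computed from the four counts as in the following sketch. Accuracy is computed here as the share of correctly classified samples, (TP + TN) divided by the total, in line with the verbal definition of accuracy given above. Returning 0.0 when a denominator is zero is a convention assumed for the example, as are the function name and the counts.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Precision, recall (= TPR), FPR, accuracy and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "P": precision,
        "R": recall,
        "TPR": recall,  # sensitivity uses the same formula as recall
        "FPR": fpr,
        "acc": accuracy,
        "F1": f1,
    }


# Hypothetical counts: 10 abnormal links and 90 normal links.
metrics = evaluation_metrics(tp=8, fp=2, fn=2, tn=88)
```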

The detection effect of the abnormal link detection model can also be evaluated by the precision-recall (PR) curve, which corresponds to the precision and recall among the evaluation indices; the receiver operating characteristic (ROC) curve, which corresponds to the specificity (expressed as FPR) and the sensitivity; the area under the ROC curve (ROC-AUC); and the area under the PR curve (PR-AUC). The abscissa (x) of the PR curve is the recall, and the ordinate (y) is the precision. The abscissa (x) of the ROC curve is the specificity expressed as FPR, and the ordinate (y) is the sensitivity. The value of ROC-AUC is the area enclosed by the ROC curve and the coordinate axes, and the value of PR-AUC is the area enclosed by the PR curve and the coordinate axes. The closer the ROC curve is to the upper left corner, the greater the value of AUC. The larger the value of AUC, the closer the precision and recall are to 1, and the better the detection performance of the model.
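As a sketch of how ROC-AUC can be obtained, the following code lowers the decision threshold step by step over the sorted scores, records the (FPR, TPR) points of the ROC curve, and integrates the curve with the trapezoidal rule. It assumes that at least one positive and one negative sample are present; the function name, labels and scores are illustrative.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve (x = FPR, y = TPR) via the trapezoidal rule.

    Assumes labels are 1 (positive) or 0 (negative) and both classes occur.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Add a curve point only after the last sample of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels and model scores for six links.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

In this example 8 of the 9 positive-negative pairs are ranked correctly, so the area is 8/9.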

(10) Preset conditions.

In the embodiment of the present application, the preset condition is used to determine whether training of the abnormal link detection model is complete. Specifically, training is determined to be complete when the evaluation index of the abnormal link detection model reaches or exceeds a threshold, when the evaluation index is difficult to improve further, when the number of training rounds reaches or exceeds a threshold, and so on. If the abnormal link detection model obtained in the previous training is the second model and the second model is being trained, the preset condition satisfied when training of the second model is complete may include, but is not limited to, at least one of the following: the precision of the second model is greater than or equal to a first threshold; the recall of the second model is greater than or equal to a second threshold; the improvement in the precision of the second model is less than or equal to a third threshold; the improvement in the recall of the second model is less than or equal to a fourth threshold; the number of training rounds of the second model is greater than or equal to a fifth threshold; the accuracy of the second model is greater than or equal to a sixth threshold; the improvement in the accuracy of the second model is less than or equal to a seventh threshold; the F1 value, that is, the harmonic mean of the precision and recall of the second model, is greater than or equal to an eighth threshold; and so on. The above thresholds are not limited, and the third threshold may be equal to the fourth threshold. To improve the training effect, the thresholds used in the current training may be equal to or greater than those used in the previous training.
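One possible way to check a preset condition is sketched below. The application allows any combination of the listed conditions; the particular combination used here (target reached, improvement stalled, or round budget exhausted), the function name and the threshold keys are assumptions made for illustration.

```python
def training_finished(precision, recall, prev_precision, prev_recall, rounds, t):
    """Return True when one assumed combination of preset conditions is met.

    `t` is a hypothetical dict of thresholds:
    precision, recall, precision_gain, recall_gain, max_rounds.
    """
    # Evaluation indices reach or exceed their thresholds.
    reached_target = precision >= t["precision"] and recall >= t["recall"]
    # Evaluation indices are difficult to improve further.
    stalled = (
        precision - prev_precision <= t["precision_gain"]
        and recall - prev_recall <= t["recall_gain"]
    )
    # Number of training rounds reaches or exceeds its threshold.
    out_of_budget = rounds >= t["max_rounds"]
    return reached_target or stalled or out_of_budget


thresholds = {
    "precision": 0.90,
    "recall": 0.85,
    "precision_gain": 0.001,
    "recall_gain": 0.001,
    "max_rounds": 20,
}
done = training_finished(0.92, 0.88, 0.80, 0.80, rounds=3, t=thresholds)
```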

(11) Unsupervised learning and supervised learning.

Unsupervised learning solves pattern recognition problems based on unlabeled samples. Commonly used unsupervised learning algorithms include matrix factorization, the isolation forest algorithm, principal component analysis (PCA), isometric mapping (Isomap), locally linear embedding, Laplacian eigenmaps, Hessian locally linear embedding, local tangent space alignment, and so on. A typical example of unsupervised learning is clustering, the purpose of which is to group similar things together without caring what the resulting classes are.

Supervised learning is the process of using labeled samples to adjust the parameters of a classifier so that it achieves the required performance; it is also known as supervised training or learning with a teacher. Common supervised learning tasks include regression analysis and statistical classification. Typical algorithms are k-nearest neighbors (KNN) and the support vector machine (SVM).

The method provided by the present application will be described below from the model training side and the model application side.

The training method of the abnormal link detection model provided by the embodiments of the present application involves artificial intelligence technology and can be applied to data processing methods such as data training, machine learning and deep learning. In these methods, training data (such as the network data of the network nodes in the embodiments of the present application) undergoes symbolized and formalized intelligent information modeling, extraction, preprocessing, training and so on, finally yielding a trained abnormal link detection model (such as the first model, the second model and the third model in the embodiments of the present application). In addition, the abnormal link detection method provided by the embodiments of the present application may use the trained abnormal link detection model (such as the first model in the embodiments of the present application): input data (such as the network features in the embodiments of the present application) is input into the abnormal link detection model to obtain output data (such as the detection result of the communication link in the embodiments of the present application). It should be noted that the training method of the abnormal link detection model and the abnormal link detection method provided by the embodiments of the present application are inventions based on the same concept, and can also be understood as two parts of one system, or as two stages of one overall process, namely the model training stage and the model application stage.

The model training stage includes a model initialization phase and a model training phase. The model training phase is used to train the previously obtained model (such as the second model in the embodiments of the present application) to obtain, for example, the first model. The model initialization phase is used to build a model (such as the third model in the embodiments of the present application). This application does not limit the initialization method of the abnormal link detection model. A supervised learning method can be used to construct the initialization model of the abnormal link detection model based on labeled samples (such as the labeled samples in the second sample set in the embodiments of the present application). Alternatively, an unsupervised learning method can first be used to classify unlabeled samples (such as the unlabeled samples in the second sample set) into unlabeled samples of abnormal links and unlabeled samples of normal links; the unlabeled samples of abnormal links are then labeled manually and used together with the labeled samples (such as the labeled samples in the second sample set) to build the initialization model of the abnormal link detection model; and so on.

In a possible example, the initialization method of the abnormal link detection model includes the following steps A1-A3, wherein:

A1: Obtain the abnormal score value of each unlabeled sample in the second sample set.

In the embodiment of the present application, the third model is the initialization model of the abnormal link detection model. The second sample set includes the pre-stored labeled samples and unlabeled samples before the samples used to construct the third model are selected. When the samples selected from the second sample set are P unlabeled samples, the second sample set can be understood as the set of labeled samples and unlabeled samples before the P unlabeled samples are selected.

This application does not limit how the second sample set is chosen. All or part of the stored samples can be used; when part of the samples is used, they can be samples obtained in a recent period, historical samples of the communication link to be detected, or historical samples of a communication link of the same type as that communication link, which is not limited here. In a possible example, the network topology information of the labeled samples and the unlabeled samples in the second sample set is consistent with the network topology information of the communication link to be detected. The communication link to be detected may be the communication link on which the abnormal link detection model is deployed. The samples in the second sample set may be historical samples of that communication link, or historical samples of a communication link of the same type. It can be understood that selecting unlabeled samples and labeled samples that correspond to the same network topology information as the second sample set can improve the accuracy of detecting whether the communication link is an abnormal link.

The abnormal score value describes how likely it is that the communication link corresponding to an unlabeled sample is abnormal, and can be expressed as a probability. This application does not limit the method for obtaining the abnormal score value. It can be obtained with an unsupervised learning method; alternatively, the most abnormal labeled sample can be selected as a reference sample, each unlabeled sample in the second sample set can be compared with the reference sample to obtain a similarity value, and the similarity value can be used as the abnormal score value; and so on.
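The reference-sample approach can be sketched as follows. The application does not specify a similarity measure; mapping the Euclidean distance d to 1 / (1 + d) is an assumption made here, as are the function name, the feature vectors and the sample identifiers.

```python
import math


def similarity_score(sample, reference):
    """Similarity in (0, 1]; 1.0 means the sample equals the reference sample."""
    return 1.0 / (1.0 + math.dist(sample, reference))


# Hypothetical feature vector of the most abnormal labeled sample.
reference = [5.0, 12.0]

# Hypothetical unlabeled samples, keyed by sample id.
unlabeled = {"u1": [5.0, 12.0], "u2": [2.0, 8.0], "u3": [0.0, 0.0]}

# The similarity to the reference sample is used as the abnormal score value.
anomaly_scores = {sid: similarity_score(x, reference) for sid, x in unlabeled.items()}
```

In practice the features would usually be normalized first so that no single dimension dominates the distance.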

This application does not limit the execution conditions of step A1. Step A1 may be executed after the abnormal link detection model is deployed on the detection node, after the number of stored unlabeled samples exceeds a threshold, or after the time elapsed since the first unlabeled sample was received exceeds a threshold. The above thresholds are not limited.

A2: According to the abnormal score value of each unlabeled sample in the second sample set, select P unlabeled samples from the second sample set.

This application does not limit P, which is a positive integer. P can be set according to the number of unlabeled samples, and/or the number of positive samples among the labeled samples, and/or the number of negative samples among the labeled samples, and so on, or according to an evaluation index preset for the abnormal link detection model.

In the embodiment of the present application, the abnormal score value of any of the P unlabeled samples is greater than or equal to the abnormal score value of any unlabeled sample in the second sample set other than the P unlabeled samples. The present application does not limit the method for selecting the P unlabeled samples. The unlabeled samples in the second sample set may be sorted by abnormal score value in descending or ascending order: in descending order, the unlabeled samples at the first P positions are taken; in ascending order, the unlabeled samples at the last P positions are taken. Alternatively, P reference unlabeled samples can be selected at random, and the abnormal score value of each remaining unlabeled sample is then compared one by one with those of the P reference unlabeled samples, replacing a reference sample whose abnormal score value is smaller. It should be noted that the P unlabeled samples may include unlabeled samples with equal abnormal score values, and an unlabeled sample outside the P unlabeled samples may have the same abnormal score value as one of the P unlabeled samples. The P unlabeled samples can be understood as the most abnormal part of the unlabeled samples in the second sample set.
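Selecting the P samples with the highest abnormal score values can be sketched with the standard-library `heapq.nlargest`, which gives the same result as sorting in descending order and taking the first P entries. The function name, sample identifiers and scores are illustrative.

```python
import heapq


def select_most_abnormal(scores, p):
    """Return the ids of the p samples with the highest abnormal score values."""
    return heapq.nlargest(p, scores, key=scores.get)


# Hypothetical abnormal score values of the unlabeled samples.
scores = {"u1": 0.10, "u2": 0.95, "u3": 0.40, "u4": 0.80, "u5": 0.05}
selected = select_most_abnormal(scores, p=2)

# Equivalent selection by sorting in descending order and taking the first P ids.
by_sorting = sorted(scores, key=scores.get, reverse=True)[:2]
```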

A3: Build the third model according to the labeled samples in the second sample set and the P labeled samples.

The P labeled samples are obtained by labeling the P unlabeled samples respectively. This application does not limit the labeling method: the P unlabeled samples can be labeled manually, or used directly as positive samples.

This application also does not limit the method of constructing the third model. A logistic regression or decision tree algorithm can be used to perform classification on the network data and labels of the labeled samples in the second sample set and of the P labeled samples, thereby obtaining the parameters of the abnormal link detection model (that is, the third model). In simple terms, the abnormal link detection model is equivalent to a function in which the network data (or the feature data corresponding to the network data) of each labeled sample is a known constant, and the label of the labeled sample is obtained by combining that constant with the parameters of the abnormal link detection model (for example, by multiplication); the parameters of the abnormal link detection model can therefore be obtained from the network data and labels of the labeled samples in the second sample set and the P labeled samples. Further, the parameters obtained by classification can be adjusted with the gradient descent method, Newton's method, the conjugate gradient method, a quasi-Newton method, or a heuristic method (for example, simulated annealing, a genetic algorithm, the ant colony algorithm or the particle swarm algorithm), and the parameters obtained in each round are adjusted again in the same way until it is determined, on the labeled samples in the second sample set and the P labeled samples, that the initialization training of the abnormal link detection model is complete. The parameters of the abnormal link detection model at that point are used as the initialization parameters of the abnormal link detection model (that is, the parameters of the third model).
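As one concrete instance of the options named above, the following sketch fits a logistic regression model with batch gradient descent. It is an illustration under assumptions, not the construction method of the application: the learning rate, the number of epochs, the two-dimensional features and the labels (1 for abnormal link, 0 for normal link) are all hypothetical. The optional `weights` and `bias` arguments allow training to start from the parameters of a previously trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, weights=None, bias=0.0,
                              lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    w = list(weights) if weights is not None else [0.0] * n_features
    b = bias
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error of the current parameters on this sample.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Return 1 (abnormal link) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical normalized features of labeled samples.
features = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.1, 0.2], [0.2, 0.1], [0.15, 0.05]]
labels = [1, 1, 1, 0, 0, 0]
w, b = train_logistic_regression(features, labels)
predictions = [predict(x, w, b) for x in features]
```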

It can be understood that, in steps A1-A3, P unlabeled samples are selected from the second sample set as new training data. The P unlabeled samples are not selected at random; they are the most abnormal data, selected according to the abnormal score value of each unlabeled sample in the second sample set, which reduces wasted labeling effort. Constructing the initialization model (that is, the third model) of the abnormal link detection model from the P labeled samples corresponding to the P most abnormal unlabeled samples together with the existing labeled samples can improve the accuracy with which the model detects abnormal links.

After the initialization model of the abnormal link detection model is obtained, the model training phase can be entered, and the same method is used in each round of training. This application does not limit the execution conditions of model training. Training may be triggered when network data of network nodes in the communication link is received, when the number of unlabeled samples received or stored in advance exceeds a threshold, when the time since the last model training exceeds a threshold, or when no network data from the network nodes in the communication link has been received for a long time. The above thresholds and durations are not limited.

This application does not limit the training method of the abnormal link detection model. The abnormal link detection model obtained from the previous training can be trained on newly added labeled samples. Alternatively, an unsupervised learning method, or the abnormal link detection model obtained from the previous training, can first be used to distinguish unlabeled samples of abnormal links from unlabeled samples of normal links among the unlabeled samples in the sample set; the most abnormal unlabeled samples are then selected and labeled, and the model is trained on them together with the labeled samples; and so on.

Please refer to FIG. 1, which is a schematic flowchart of a model training method proposed by an embodiment of the present application. As shown in FIG. 1, the method can be executed by an abnormal link detection model, an abnormal link detection apparatus, a detection node, a terminal or other equipment, and the method includes:

S102: Select K unlabeled samples from the first sample set.

This application does not limit K, which is a positive integer; refer to the description of P, which is not repeated here. Optionally, K and P are equal, that is, the same number of most abnormal unlabeled samples is selected in the model initialization phase and in the model training phase. It can be understood that, whatever the value of K, new unlabeled samples are selected and the abnormal link detection model is trained on them, which realizes incremental learning, improves the effect of model training, and helps improve the accuracy of abnormal link detection.

The present application does not limit the method for selecting the K unlabeled samples: they may be selected at random, or the K most abnormal unlabeled samples may be selected. It can be understood that randomly selecting K unlabeled samples from the sample set for training allows the abnormal link detection model to learn the distribution of positive and negative samples among the unlabeled samples during training. However, since abnormal data is rarer than normal data, random selection may yield few or no positive samples.

In a possible example, step S102 includes the following steps B1 and B2, wherein:

B1: Obtain the anomaly score value of each unlabeled sample in the first sample set.

For the method of obtaining the abnormal score value, refer to the description of A1. The abnormal score value may also be obtained from the abnormal link detection model obtained in the previous training (that is, the second model), and so on, which is not limited here. Obtaining the abnormal score values of the unlabeled samples from the abnormal link detection model obtained in the previous training can improve the efficiency and accuracy of obtaining the abnormal score values.

B2: According to the abnormal score value of each unlabeled sample in the first sample set, select K unlabeled samples from the first sample set.

In the embodiment of the present application, the abnormal score value of any unlabeled sample in the K unlabeled samples is greater than or equal to the abnormal score value of any unlabeled sample except the K unlabeled samples in the first sample set. For the acquisition method of the K unlabeled samples, reference may be made to the description of A2, which will not be repeated here.

It can be understood that in step B1 and step B2, the samples to be marked selected from the unmarked samples are the most abnormal K unmarked samples in the first sample set. That is to say, the sample set of the abnormal link detection model includes samples that may be abnormal, which can improve the effect of model training and facilitate the improvement of the accuracy of the model for detecting abnormal links.
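As an illustration only, steps B1 and B2 can be sketched in Python as follows. The function name `select_top_k`, the sample identifiers, and the score values are hypothetical; the application does not prescribe how anomaly scores are computed, so the scores are simply taken as given.

```python
from typing import Dict, List


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k unlabeled samples with the highest anomaly scores (step B2)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Descending by score; ties are broken by sample id so the result is deterministic.
    ranked = sorted(scores, key=lambda sid: (-scores[sid], sid))
    return ranked[:k]


# Hypothetical anomaly scores obtained in step B1 for five unlabeled link samples.
scores = {"link_a": 0.91, "link_b": 0.12, "link_c": 0.77, "link_d": 0.05, "link_e": 0.77}
top2 = select_top_k(scores, 2)
```

With these assumed scores, every selected sample has a score greater than or equal to that of every sample left in the set, which is the property stated for the K unlabeled samples.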

S104: Select M unlabeled samples from the first sample set as negative samples.

This application does not limit M, and M is a positive integer. Reference can be made to the description of P, and details are not repeated here. In the embodiment of the present application, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, that is, the M unlabeled samples are regarded as normal link data.

The present application does not limit the method for selecting M unlabeled samples, which may be randomly selected, or the most normal M unlabeled samples may be selected. It can be understood that randomly selecting M unlabeled samples in the sample set as negative samples allows the abnormal link detection model to learn the distribution of positive and negative samples in the unlabeled samples during the training process. It should be noted that the M unlabeled samples should be different from the K unlabeled samples.

In a possible example, step S104 includes the following two ways, wherein:

The first way is to count the number of positive samples among the labeled samples in the first sample set and the K labeled samples; according to this number of positive samples, M unlabeled samples are selected from the first sample set as negative samples, where M is equal to the number of positive samples.

Wherein, the number of positive samples among the labeled samples in the first sample set and the K labeled samples can be understood as the number of abnormal-link samples in the sample set of the second model. That is to say, the number of abnormal-link samples in the sample set used to train the abnormal link detection model is counted first, and then the most normal unlabeled samples are selected from the first sample set, the number of selected unlabeled samples being equal to the counted number of abnormal-link samples. In this way, the number of new negative samples in the sample set used to train the abnormal link detection model is equal to the number of positive samples in the sample set, which can roughly balance positive and negative samples, reduce label noise, improve the effect of model training, and help improve the accuracy of the model in detecting abnormal links.
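A minimal sketch of this first way, assuming labels are encoded as 1 for an abnormal link (positive sample) and 0 for a normal link, and assuming anomaly scores for the unlabeled samples are already available. The function and variable names are hypothetical and not part of the application.

```python
from typing import Dict, List, Set


def count_positives(labels: Dict[str, int]) -> int:
    """Count samples labeled 1 (abnormal link)."""
    return sum(1 for y in labels.values() if y == 1)


def select_balanced_negatives(
    existing_labels: Dict[str, int],    # labeled samples already in the first sample set
    new_labels: Dict[str, int],         # the K samples that have just been labeled
    unlabeled_scores: Dict[str, float], # anomaly score of each unlabeled sample
    exclude: Set[str],                  # ids of the K samples, which must not be reused
) -> List[str]:
    """Pick M most-normal unlabeled samples, with M equal to the positive count."""
    m = count_positives(existing_labels) + count_positives(new_labels)
    candidates = [sid for sid in unlabeled_scores if sid not in exclude]
    # Ascending by score, so the most normal samples come first.
    candidates.sort(key=lambda sid: (unlabeled_scores[sid], sid))
    return candidates[:m]


existing = {"s1": 1, "s2": 0}
newly_labeled = {"u1": 1, "u2": 0}
unlabeled = {"u1": 0.9, "u2": 0.8, "u3": 0.1, "u4": 0.2, "u5": 0.5}
negatives = select_balanced_negatives(existing, newly_labeled, unlabeled, {"u1", "u2"})
```

Here two positive samples are counted, so two pseudo-negative samples are drawn from the low-score end of the remaining unlabeled samples.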

In the second way, step S104 includes the following steps C1-C3, wherein:

C1: Obtain the anomaly score value of each unlabeled sample in the first sample set.

For step C1, reference may be made to the description of step B1, which will not be repeated here.

C2: Select L unlabeled samples from the first sample set according to the abnormal score value of each unlabeled sample in the first sample set.

This application does not limit L, which is a positive integer; reference can be made to the description of P, and details are not repeated here. In the embodiment of the present application, the abnormal score value of any of the L unlabeled samples is less than or equal to the abnormal score value of any sample in the first sample set other than the L unlabeled samples. The present application does not limit the method for selecting the L unlabeled samples. The abnormal score values of the unlabeled samples in the first sample set may be sorted in descending or ascending order. When the descending order is used as the first sorting, the unlabeled samples corresponding to the last L serial numbers in the first sorting are taken; in ascending order, the unlabeled samples corresponding to the first L serial numbers are taken. Alternatively, L reference unlabeled samples may first be randomly selected from the first sample set, and the abnormal score value of each remaining unlabeled sample is then compared one by one with those of the L reference samples, replacing a reference sample that has a larger abnormal score value whenever the remaining sample's value is smaller. It should be noted that the L unlabeled samples may include samples with equal abnormal score values, and unlabeled samples in the first sample set other than the L unlabeled samples and the K unlabeled samples may also have abnormal score values equal to those of some of the L unlabeled samples. The L unlabeled samples can be understood as the most normal part of the unlabeled samples in the first sample set, or as the non-outlier points in the first sample set.

C3: Select M unlabeled samples from L unlabeled samples.

Wherein, the M unlabeled samples may be randomly selected from the L unlabeled samples, or may be samples obtained in a recent period, or may be historical samples of the communication link, or may be historical samples of communication links of the same type as the communication link, etc., which is not limited here.

It can be understood that in steps C1-C3, the unlabeled samples selected according to the abnormal score values are M unlabeled samples randomly selected from the most normal L unlabeled samples in the first sample set, and these unlabeled samples are used as negative samples. That is to say, the samples newly added to the sample set for training the abnormal link detection model are samples of normal links, which can avoid introducing noise, improve the effect of model training, and help improve the accuracy of the model in detecting abnormal links.
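Steps C1-C3 can be sketched as follows, under the assumption that the M samples are drawn at random from the L lowest-scoring samples (one of the options listed above). The function name, the `seed` parameter, and the example scores are hypothetical. `heapq.nsmallest` is used here in place of a full sort; it yields the same L samples as taking the first L entries of an ascending sort.

```python
import heapq
import random
from typing import AbstractSet, Dict, List, Optional


def select_negatives(
    scores: Dict[str, float],
    l: int,
    m: int,
    exclude: AbstractSet[str] = frozenset(),
    seed: Optional[int] = None,
) -> List[str]:
    """C2: keep the l lowest-scoring samples; C3: draw m of them at random."""
    if not 0 < m <= l:
        raise ValueError("require 0 < m <= l")
    # Drop the K samples already chosen for labeling so the two sets stay disjoint.
    pool = {sid: s for sid, s in scores.items() if sid not in exclude}
    most_normal = heapq.nsmallest(l, pool, key=pool.get)
    if len(most_normal) < m:
        raise ValueError("not enough candidate samples")
    return random.Random(seed).sample(most_normal, m)


# Hypothetical anomaly scores from step C1; u1 and u2 were already chosen as the K samples.
scores = {"u1": 0.95, "u2": 0.80, "u3": 0.10, "u4": 0.05, "u5": 0.30, "u6": 0.20}
negatives = select_negatives(scores, l=3, m=2, exclude={"u1", "u2"}, seed=0)
```

Restricting the random draw to the L most normal samples keeps some randomness in the negative set while lowering the chance that an abnormal link is treated as a negative sample.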

It should be noted that the above two manners do not constitute limitations to the embodiments of the present application. In practical applications, M unlabeled samples may be selected in combination with the first manner and the second manner.

S106: Train the second model obtained from the previous training according to the labeled samples, K labeled samples, and M unlabeled samples in the first sample set, and obtain the first model when the training meets the preset condition.

Wherein, for the preset conditions, reference may be made to the foregoing definitions, which will not be repeated here. In a possible example, the preset conditions include at least one of the following: the accuracy of the second model is greater than or equal to the first threshold; the recall rate of the second model is greater than or equal to the second threshold; the improvement in accuracy is less than or equal to the third threshold; the improvement in recall is less than or equal to the fourth threshold; the number of training times of the second model is greater than or equal to the fifth threshold; the accuracy of the second model is greater than or equal to the sixth threshold; the improvement in accuracy is less than or equal to the seventh threshold; the harmonic mean corresponding to precision and recall is greater than or equal to the eighth threshold. In this way, whether the training of the second model is complete is determined through different preset conditions, which can improve the accuracy with which the first model obtained after training detects abnormal links.
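As a hedged illustration, a stopping check built from a few of these conditions might look like the following. The threshold values, the parameter names, and the choice to stop as soon as any one configured condition holds are assumptions made for this sketch; the application only lists the conditions that may be included. The harmonic mean of precision and recall is the usual F1 score, F1 = 2PR / (P + R).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def training_done(
    precision: float,
    recall: float,
    prev_recall: float,
    rounds: int,
    *,
    min_recall: float = 0.9,   # assumed value for the recall threshold
    max_gain: float = 0.005,   # assumed value for the recall-improvement threshold
    max_rounds: int = 20,      # assumed value for the training-count threshold
    min_f1: float = 0.9,       # assumed value for the harmonic-mean threshold
) -> bool:
    """Return True when any one of the configured preset conditions is met."""
    return (
        recall >= min_recall
        or recall - prev_recall <= max_gain
        or rounds >= max_rounds
        or f1_score(precision, recall) >= min_f1
    )
```

For example, `training_done(0.5, 0.5, 0.3, 1)` is false because recall is low and still improving, while `training_done(0.5, 0.5, 0.499, 1)` is true because the recall gain has fallen to 0.001.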

This application does not limit the training method of the second model; the parameters of the second model may be adjusted based on the gradient descent method, the Newton algorithm, the conjugate gradient method, the quasi-Newton method, heuristic methods (for example, simulated annealing, genetic algorithms, ant colony algorithms, and particle swarm algorithms), or other methods. Based on such a method, the parameters of the second model obtained last time are adjusted until the training of the second model meets the preset conditions, at which point the training is complete and the second model obtained after training is used as the first model.

In the method described in FIG. 1, K unlabeled samples are first selected from the first sample set, and then M unlabeled samples are selected from the first sample set as negative samples. After the K unlabeled samples are labeled, the resulting K labeled samples, together with the M unlabeled samples and the labeled samples in the first sample set, are used to train the second model obtained from the previous training, so as to obtain the trained first model. In this way, by selecting samples from the unlabeled samples, the model can learn the distribution of positive and negative samples in the unlabeled samples during training. In addition, retraining the model obtained from the previous training according to the selected samples and the existing labeled samples can further improve the detection accuracy.
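One round of the flow summarized above can be sketched end to end as follows. This is a toy illustration, not the application's model: each sample is reduced to a single feature value, the "model" is just a decision threshold refit from scratch, and `score_fn` and `label_fn` are hypothetical stand-ins for the previous model's scoring and for the manual labeling step. It also assumes the training data contains at least one positive and one negative sample.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]  # (feature value, label); 1 = abnormal link, 0 = normal link


def fit_threshold(train: List[Sample]) -> float:
    """Toy model: midpoint between the mean positive and mean negative feature value."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (mean(pos) + mean(neg)) / 2


def training_round(
    labeled: List[Sample],
    unlabeled: Dict[str, float],
    score_fn: Callable[[float], float],
    label_fn: Callable[[str], int],
    k: int,
    m: int,
) -> float:
    """Select K samples to label and M pseudo-negatives, then refit the toy model."""
    scores = {sid: score_fn(x) for sid, x in unlabeled.items()}
    ranked = sorted(scores, key=lambda sid: (-scores[sid], sid))
    top_k, rest = ranked[:k], ranked[k:]   # the K most abnormal samples
    negatives = rest[-m:]                  # the M most normal of the remaining samples
    newly_labeled = [(unlabeled[sid], label_fn(sid)) for sid in top_k]
    pseudo_negative = [(unlabeled[sid], 0) for sid in negatives]
    return fit_threshold(labeled + newly_labeled + pseudo_negative)


labeled = [(0.9, 1), (0.1, 0)]
unlabeled = {"a": 0.8, "b": 0.7, "c": 0.2, "d": 0.3}
truth = {"a": 1, "b": 1, "c": 0, "d": 0}   # what a human labeler would answer
threshold = training_round(labeled, unlabeled, lambda x: x, truth.__getitem__, k=2, m=2)
```

In a real system the refit would continue from the previous model's parameters and the round would repeat until a preset condition is met; the sketch only shows how the three groups of samples are assembled for one round.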

Referring to FIG. 2, FIG. 2 is an architecture diagram of a communication system provided by an embodiment of the present application. As shown in FIG. 2, the communication system may include a terminal (e.g., terminal 211), a network node (e.g., network node 221, network node 222), a detection node (e.g., detection node 231), and a target device (e.g., network device 241, application server 251). This embodiment of the present application does not limit the number of the above devices.

The communication system in the embodiments of the present application may be a communication system supporting a fourth generation (4G) access technology, for example, long term evolution (LTE) access technology; or a communication system supporting a fifth generation (5G) access technology, for example, new radio (NR) access technology; or a communication system supporting multiple wireless technologies, for example, both LTE and NR; or a communication system supporting microwave communication technology, wavelength division communication technology, optical transport network (OTN) technology, wireless communication technology, broadband and narrowband technology, etc. In addition, the communication system can be adapted to future-oriented communication technologies.

The communication system can also be another type of communication system, such as a C-V2X system, a public land mobile network (PLMN), a device-to-device (D2D) network, a machine-to-machine (M2M) network, the Internet of things (IoT), a wireless local area network (WLAN), or another network, which is not limited here.

In this embodiment of the present application, the terminal may be connected to the network node in a wireless or wired manner, and then connected to the target device via the network node in a wireless or wired manner. A terminal may be a device that provides voice or data connectivity to a user. A terminal may be referred to as a user equipment (UE), a mobile station, a subscriber unit, a station, a terminal equipment (TE), etc. The terminal may be a cellular phone, a personal digital assistant (PDA), a wireless modem, a handheld device, a laptop computer, a cordless phone, a wireless local loop (WLL) station, a mobile phone, a tablet computer (pad), etc. With the development of wireless communication technology, any device that can access a wireless communication network, communicate with the wireless network side, or communicate with other objects through a wireless network can be a terminal in the embodiments of the present application, for example, terminals and cars in intelligent transportation, household equipment in smart homes, power meter reading instruments, voltage monitoring instruments, and environmental monitoring instruments in smart grids, video monitoring instruments in smart security networks, cash registers, etc. Terminals can be stationary or mobile. Exemplarily, as shown in FIG. 2, the terminal is a mobile phone.

The network node in the embodiment of the present application is used to provide a transmission service for the terminal. A network node may act as a relay node (RN), that is, a node that provides wireless backhaul services for terminals, where wireless backhaul services refer to data and/or signaling backhaul services provided through wireless backhaul links. On the one hand, a relay node can provide wireless access services for terminals through an access link (AL); on the other hand, a relay node can connect to the target device through a one-hop or multi-hop backhaul link (BL). The relay node can thus forward data and/or signaling between the terminal and the target device, thereby expanding the coverage of the communication system. Exemplarily, as shown in FIG. 2, the network node is a relay node.

The target device in this embodiment of the present application is deployed in a communication link, and is an apparatus for providing a terminal with a wireless communication function. The target device can be a base station, an access point, a node, an evolved NodeB (eNB), or a 5G base station (next generation NodeB, gNB), which refers to a device in the access network that communicates with wireless terminals over the air interface through one or more sectors. By converting received air interface frames into Internet Protocol (IP) packets, the base station can act as a router between the wireless terminal and the rest of the access network, which can include an Internet Protocol network. The base station may also coordinate the management of the attributes of the air interface.

The target device can also be an application server, for example, a server of an intelligent traffic system (ITS), a server of a navigation application, a server of a payment application, a server of a medical information system, a server for electronic information file management, etc., which is not limited here. Exemplarily, as shown in FIG. 2, the target device includes an access network device and an application server.

The detection node in the embodiment of the present application is deployed in the communication link, and is used to monitor whether the communication link is an abnormal link. An anomaly detection model can be deployed on the detection node. The detection node can be a node deployed separately in the communication system or a node deployed on each network node, which is not limited here, and can be deployed according to the actual situation of the communication link. It can be understood that when the detection node is deployed on each network node, it detects only that one network node, which can improve the detection efficiency. When the detection node is deployed separately in the communication system, it can obtain the network data of any network node in the communication link and thereby analyze the entire communication link comprehensively, which can improve the detection accuracy.

The detection node can be specifically used to obtain network data of at least one network node in the communication link; obtain network features of the network data; and input the network features into the abnormal link detection model to obtain the detection result of whether the communication link is an abnormal link. Please refer to FIG. 3, which is a schematic structural diagram of a detection node according to an embodiment of the present application. As shown in FIG. 3, the detection node 300 may include an input module 301, a feature acquisition module 302, a detection training module 303, an output module 304, and the like.

Wherein, the input module 301 can be used to obtain network data of at least one network node in the communication link. The feature acquisition module 302 can be used to obtain network features of the network data. The detection training module 303 can be used to run detection on the network features to obtain the detection result of whether the communication link is an abnormal link, and can also be used to train the abnormal link detection model. The output module 304 can be used to output the detection result. When the detection result is an abnormal link, the output module 304 can also report it (to pre-assigned business personnel, or to the system, which then assigns business personnel, etc., which is not limited here).
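The division of labor among modules 301 to 304 can be pictured with the following sketch. The class name, the callable-based interfaces, and the fixed decision threshold are assumptions made for illustration; the application does not specify the interfaces between the modules, and training is omitted here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]  # one network-data record from a network node


@dataclass
class DetectionNode:
    extract: Callable[[List[Record]], List[float]]  # stands in for feature acquisition module 302
    score: Callable[[List[float]], float]           # stands in for detection training module 303 (inference only)
    threshold: float = 0.5                          # assumed decision threshold

    def detect(self, network_data: List[Record]) -> str:
        """network_data is what the input module 301 would have collected."""
        features = self.extract(network_data)
        return self.report(self.score(features) >= self.threshold)

    @staticmethod
    def report(is_abnormal: bool) -> str:
        """Stands in for output module 304."""
        return "abnormal link" if is_abnormal else "normal link"


# Hypothetical wiring: the only feature is the worst SNR, and a low SNR scores as abnormal.
node = DetectionNode(
    extract=lambda records: [min(r["snr_db"] for r in records)],
    score=lambda feats: 1.0 if feats[0] < 15.0 else 0.0,
)
result = node.detect([{"snr_db": 28.0}, {"snr_db": 12.5}])
```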

It should be noted that the abnormal link detection model provided in this application can be applied to any communication link. In other anomaly detection scenarios, the training data in the sample set used to train the anomaly detection model may differ from the network data in the embodiments of this application, and the data features of the training data may also differ from the network features in the embodiments of this application. The sample set can be selected using the selection method described in the foregoing embodiments, and the model can be trained using the training method described in the foregoing embodiments.

Please refer to FIG. 4. FIG. 4 is a schematic flowchart of an abnormal link detection method provided by an embodiment of the present application. The method can be applied to any communication network as described in FIG. 2, and the method can be performed by an abnormal link detection model or an abnormal link detection device, or a detection node or terminal, etc. The method includes but is not limited to the following steps:

S402: Receive network data of at least one network node in the communication link.

Wherein, the network data may include but not limited to performance data of network nodes and network topology information of communication links corresponding to the network nodes, etc., which are not limited herein. The network topology information is used to describe the connection relationship between each network node in the communication link. Performance data may include but not limited to at least one of the following information: signal-to-noise ratio, input signal level, errored seconds, severely errored seconds, unavailable time, skewness, etc. This will not be repeated here.

This application does not limit the conditions under which step S402 is executed. The network data may be sent by the network node at regular intervals. The interval may be a fixed time that is the same for all network nodes, or a different time for each network node; it may also be a dynamic time determined by the abnormal link detection model, the abnormal link detection device, the detection node, the terminal, or the like, for example according to channel quality, network load, etc., which is not limited here. The network data of the network node may also be sent when a constraint condition is met; the constraint condition may include a new service being transmitted, a service being terminated, suspended, or unable to be transmitted, the number of transmitted services exceeding a threshold, and the like. The network data of the network node may also be sent after the network node receives a request, sent by the execution subject, for acquiring the network data of the network node.

S404: Obtain network features corresponding to the network data.

Among them, the network characteristics are used to describe the performance characteristics of the communication link, which can be obtained by statistical analysis based on network data of different dimensions, or obtained through a network embedding method.
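A minimal sketch of the statistical route, assuming each network-data record carries a signal-to-noise ratio, an errored-seconds count, and an unavailable time for one measurement window. The field names (`snr_db`, `errored_seconds`, `unavailable_s`, `window_s`) and the particular statistics are hypothetical; the application leaves the choice of features open, and the network embedding route is not shown.

```python
from statistics import mean
from typing import Dict, List


def link_features(records: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-window performance data of a link into a small feature vector."""
    snr = [r["snr_db"] for r in records]
    total_window = sum(r["window_s"] for r in records)
    return {
        "snr_mean": mean(snr),
        "snr_min": min(snr),
        "errored_seconds_total": sum(r["errored_seconds"] for r in records),
        "unavailable_ratio": sum(r["unavailable_s"] for r in records) / total_window,
    }


# Two hypothetical 15-minute measurement windows for one link.
records = [
    {"snr_db": 30.0, "errored_seconds": 2, "unavailable_s": 0, "window_s": 900},
    {"snr_db": 20.0, "errored_seconds": 5, "unavailable_s": 90, "window_s": 900},
]
features = link_features(records)
```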

S406: Input the network feature into the first model to obtain the detection result of the communication link.

In this embodiment of the present application, the detection result of the communication link is used to indicate whether the communication link is an abnormal link. The first model is the model obtained by training the second model obtained from the previous training according to the labeled samples in the first sample set, K labeled samples, and M unlabeled samples until the training meets the preset conditions. The K labeled samples are obtained by labeling K unlabeled samples in the first sample set respectively, and the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples. The first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected. For the selection method of the K unlabeled samples and the M unlabeled samples, and the training method of the first model, reference may be made to the method described in FIG. 1, which will not be repeated here.

In a possible example, before step S406, the method further includes: acquiring network topology information of the communication link; and using a pre-stored set of unlabeled samples and labeled samples corresponding to the network topology information as the first sample set. That is to say, selecting the unlabeled samples and labeled samples that correspond to the network topology information of the communication link as the candidate samples for training can improve the effect of model training and help improve the accuracy of detecting whether the communication link is an abnormal link.

Wherein, the network topology information may be obtained from the network data of the network node received in step S402, or from previously obtained network data of the network nodes in the communication link, or from pre-stored network topology information of the communication link, etc., which is not limited here.

In a possible example, the preset conditions include at least one of the following: the accuracy of the second model is greater than or equal to the first threshold; the recall rate of the second model is greater than or equal to the second threshold; the improvement in accuracy is less than or equal to the third threshold; the improvement in recall is less than or equal to the fourth threshold; the number of training times of the second model is greater than or equal to the fifth threshold; the accuracy of the second model is greater than or equal to the sixth threshold; the improvement in accuracy is less than or equal to the seventh threshold; the harmonic mean corresponding to precision and recall is greater than or equal to the eighth threshold. In this way, whether the training of the second model is complete is determined through different preset conditions, which can improve the accuracy with which the first model obtained after training detects abnormal links.

In a possible example, before step S406, the method further includes: acquiring the abnormal score value of each unlabeled sample in the first sample set; sorting the unlabeled samples in descending order of abnormal score value to obtain a first sorting; and taking the unlabeled samples corresponding to the first K serial numbers in the first sorting as the K unlabeled samples. That is to say, the samples selected for labeling are the most abnormal K unlabeled samples in the first sample set, so the sample set for training the abnormal link detection model includes samples that may be abnormal, which can improve the effect of model training and help improve the accuracy of detecting abnormal links.

In a possible example, before step S406, the method further includes: taking the unlabeled samples corresponding to the last L serial numbers in the first sorting as the L unlabeled samples; and selecting the M unlabeled samples from the L unlabeled samples. That is to say, the selected unlabeled samples are M unlabeled samples randomly selected from the most normal L unlabeled samples in the first sample set, and these unlabeled samples are used as negative samples. The samples newly added to the sample set for training the abnormal link detection model are therefore samples of normal links, which can avoid introducing noise, improve the effect of model training, and help improve the accuracy of detecting abnormal links.

In a possible example, before acquiring the abnormal score value of each unlabeled sample in the first sample set, the method further includes: acquiring the abnormal score value of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples pre-stored before P unlabeled samples are selected; sorting the unlabeled samples in the second sample set in descending order of abnormal score value to obtain a second sorting; taking the unlabeled samples corresponding to the first P serial numbers in the second sorting as the P unlabeled samples; and constructing a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by labeling the P unlabeled samples respectively, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the most abnormal P unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links. In one possible example, M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples. In this way, the number of new negative samples in the training sample set of the abnormal link detection model is equal to the number of positive samples in the sample set, which can roughly balance positive and negative samples, reduce label noise, and improve the accuracy of detecting abnormal links.

In the method described in FIG. 4, after the network data of the network nodes in the communication link is received, the network features of the network data are obtained, and the network features are then input into the abnormal link detection model, which is obtained by training on samples selected from the unlabeled samples together with the existing labeled samples, so as to detect the communication link and improve the accuracy of detecting abnormal links.

The methods of the embodiments of the present application are described in detail above, and the apparatuses of the embodiments of the present application are provided below.

Consistent with the embodiment shown in FIG. 1, please refer to FIG. 5. FIG. 5 is a schematic structural diagram of a model training apparatus provided by an embodiment of the present application. The model training apparatus 500 may include a selection module 501 and a training module 502, wherein:

The selection module 501 is used to select K unlabeled samples from the first sample set, and to select M unlabeled samples from the first sample set as negative samples, where the first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected;

The training module 502 is configured to train the second model obtained from the previous training according to the labeled samples in the first sample set, K labeled samples, and the M unlabeled samples, and to obtain the first model when the training meets the preset conditions, where the K labeled samples are obtained by labeling the K unlabeled samples in the first sample set respectively.

In a possible example, the selection module 501 is specifically configured to obtain the abnormal score value of each unlabeled sample in the first sample set; sort the unlabeled samples in the first sample set in descending order of abnormal score value to obtain a first sorting; and take the unlabeled samples corresponding to the first K serial numbers in the first sorting as the K unlabeled samples. That is to say, the samples selected for labeling are the most abnormal K unlabeled samples in the first sample set, so the sample set for training the abnormal link detection model includes samples that may be abnormal, which can improve the effect of model training and help improve the accuracy of detecting abnormal links.

In a possible example, the selection module 501 is specifically configured to use the unlabeled samples corresponding to the last L serial numbers in the first sorting as the L unlabeled samples, and to select the M unlabeled samples from the L unlabeled samples. That is to say, the selected unlabeled samples are M unlabeled samples randomly selected from the most normal L unlabeled samples in the first sample set, and these unlabeled samples are used as negative samples. The samples newly added to the sample set for training the abnormal link detection model are therefore samples of normal links, which can avoid introducing noise, improve the effect of model training, and help improve the accuracy of detecting abnormal links.

In a possible example, the selection module 501 is specifically configured to count the number of positive samples among the labeled samples in the first sample set and the K labeled samples, and, according to this number of positive samples, to select M unlabeled samples from the first sample set as negative samples, where M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples. That is to say, the number of new negative samples in the sample set for training the abnormal link detection model is equal to the number of positive samples in the sample set, which can roughly balance positive and negative samples, reduce label noise, improve the effect of model training, and help improve the accuracy of detecting abnormal links.

In a possible example, the selection module 501 is further configured to obtain the abnormal score value of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples pre-stored before P unlabeled samples are selected; sort the unlabeled samples in the second sample set in descending order of abnormal score value to obtain a second sorting; take the unlabeled samples corresponding to the first P serial numbers in the second sorting as the P unlabeled samples; and construct a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by labeling the P unlabeled samples respectively, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the most abnormal P unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links.

In a possible example, the network data includes at least one of the following: signal-to-noise ratio, level of an input signal, errored seconds, severely errored seconds, unavailable time, and network topology information. In this way, abnormal link detection is performed through different network data, which can improve the diversity of detection.

In the device shown in FIG. 5, K unlabeled samples are first selected from the first sample set, and then M unlabeled samples are selected from the first sample set as negative samples. After the K unlabeled samples are labeled, the resulting K labeled samples, together with the M unlabeled samples and the labeled samples in the first sample set, are used to train the second model obtained from the previous training, so as to obtain the trained first model. In this way, by selecting samples from the unlabeled samples, the model can learn the distribution of positive and negative samples in the unlabeled samples during training. In addition, retraining the model obtained from the previous training according to the selected samples and the existing labeled samples can further improve the detection accuracy.

Consistent with the embodiment shown in FIG. 4, please refer to FIG. 6. FIG. 6 is a schematic structural diagram of an abnormal link detection apparatus provided by an embodiment of the present application. The abnormal link detection apparatus 600 may include a communication unit 601 and a processing unit 602, wherein:

The communication unit 601 is configured to receive network data of at least one network node in the communication link;

The processing unit 602 is configured to obtain the network features corresponding to the network data, and to input the network features into the first model to obtain the detection result of the communication link, where the detection result is used to indicate whether the communication link is an abnormal link. The first model is the model obtained when the second model obtained from the previous training is trained according to the labeled samples in the first sample set, K labeled samples, and M unlabeled samples until the training meets the preset condition. The K labeled samples are obtained by labeling K unlabeled samples in the first sample set respectively, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the unlabeled samples and labeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected.

In a possible example, the processing unit 602 is further configured to: obtain the abnormal score value of each unlabeled sample in the first sample set; sort the unlabeled samples in the first sample set in descending order of abnormal score value to obtain a first sorting; and take the unlabeled samples corresponding to the first K serial numbers in the first sorting as the K unlabeled samples. That is to say, the samples selected for labeling are the K most abnormal unlabeled samples in the first sample set, so the sample set for training the abnormal link detection model includes samples that are likely to be abnormal, which can improve the effect of model training and help improve the accuracy of detecting abnormal links.

In a possible example, the processing unit 602 is further configured to: take the unlabeled samples corresponding to the last L serial numbers in the first sorting as L unlabeled samples; and select the M unlabeled samples from the L unlabeled samples. That is to say, the selected unlabeled samples are M unlabeled samples randomly selected from the L most normal unlabeled samples in the first sample set. Because these unlabeled samples are used as negative samples, the samples newly added to the sample set for training the abnormal link detection model are samples of normal links, which can avoid introducing noise, improve the effect of model training, and help improve the accuracy of detecting abnormal links.
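The two selections made from the first sorting may be sketched together as follows. The function name `select_k_and_m`, the `seed` parameter, and the checks that reject K + L larger than the number of unlabeled samples or M larger than L are assumptions added for the example.

```python
import random
from typing import Dict, List, Optional, Tuple


def select_k_and_m(
    scores: Dict[str, float],  # sample_id -> abnormal score value
    k: int,
    l: int,
    m: int,
    seed: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    if k + l > len(scores):
        # Assumed guard: keeps the K most abnormal and L most normal samples disjoint.
        raise ValueError("k + l must not exceed the number of unlabeled samples")
    if m > l:
        raise ValueError("m must not exceed l")
    # First sorting: descending order of abnormal score value.
    first_sorting = sorted(scores, key=scores.get, reverse=True)
    to_label = first_sorting[:k]                        # first K serial numbers
    most_normal = first_sorting[-l:] if l > 0 else []   # last L serial numbers
    # Randomly pick M of the L most normal samples to serve as negative samples.
    negatives = random.Random(seed).sample(most_normal, m)
    return to_label, negatives
```

Passing a fixed `seed` makes the random pick repeatable, which may be convenient when comparing training rounds.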

In a possible example, the processing unit 602 is further configured to: obtain the abnormal score value of each unlabeled sample in a second sample set, where the second sample set includes the labeled samples and unlabeled samples pre-stored before P unlabeled samples are selected; sort the unlabeled samples in the second sample set in descending order of abnormal score value to obtain a second sorting; take the unlabeled samples corresponding to the first P serial numbers in the second sorting as the P unlabeled samples; and construct a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by labeling the P unlabeled samples respectively, and the third model is the initialization model corresponding to the first model and the second model. In this way, the initialization model of the abnormal link detection model is constructed from the P labeled samples corresponding to the P most abnormal unlabeled samples together with the existing labeled samples, which can improve the effect of model training and the accuracy of detecting abnormal links.

In a possible example, M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples. In this way, the number of negative samples newly added to the sample set for training the abnormal link detection model is equal to the number of positive samples in that sample set, which helps balance positive and negative samples, reduces label noise, improves the effect of model training, and helps improve the accuracy of detecting abnormal links.
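A minimal sketch of this count is given below. It assumes that a positive sample carries the label 1 (abnormal link) and a negative sample the label 0; the function name `negatives_to_select` is hypothetical.

```python
from itertools import chain
from typing import Iterable, Tuple


def negatives_to_select(
    labeled: Iterable[Tuple[str, int]],    # labeled samples in the first sample set
    k_labeled: Iterable[Tuple[str, int]],  # the K newly labeled samples
) -> int:
    # M equals the number of positive samples across both groups.
    return sum(1 for _, label in chain(labeled, k_labeled) if label == 1)
```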

In a possible example, the processing unit 602 is further configured to: acquire network topology information of the communication link; and take the pre-stored set of unlabeled samples and labeled samples corresponding to the network topology information and device information as the first sample set. That is to say, selecting the unlabeled samples and labeled samples corresponding to the network topology information of the communication link as the candidate samples for training can improve the effect of model training and help improve the accuracy of detecting whether the communication link is an abnormal link.
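By way of example, the pre-stored samples may be organized per topology so that the first sample set can be looked up directly. The sketch assumes that the network topology information can be reduced to a string key; the `SampleSet` type, the function name, and the key values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SampleSet:
    labeled: List[Tuple[str, int]] = field(default_factory=list)  # (sample_id, label)
    unlabeled: List[str] = field(default_factory=list)            # sample ids


def first_sample_set_for(store: Dict[str, SampleSet], topology_key: str) -> SampleSet:
    # Return the pre-stored samples that correspond to the link's topology.
    try:
        return store[topology_key]
    except KeyError:
        raise LookupError(f"no pre-stored samples for topology {topology_key!r}") from None
```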

In the device described in FIG. 6, after the network data of the network nodes in the communication link is received, the network features corresponding to the network data are obtained, and the communication link is then detected by the abnormal link detection model trained on the samples selected from the unlabeled samples and the existing labeled samples, which improves the detection accuracy.
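The detection flow may be sketched as follows. The sketch assumes that the first model is exposed as a callable returning a probability that the link is abnormal and that a fixed threshold turns this into the detection result; `detect_link`, `extract_features`, `first_model` and `threshold` are hypothetical names.

```python
from typing import Any, Callable, Sequence


def detect_link(
    network_data: Any,
    extract_features: Callable[[Any], Sequence[float]],  # network data -> network features
    first_model: Callable[[Sequence[float]], float],     # features -> abnormal probability
    threshold: float = 0.5,                              # assumed decision threshold
) -> bool:
    features = extract_features(network_data)
    # True indicates that the communication link is detected as an abnormal link.
    return first_model(features) >= threshold
```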

Referring to FIG. 7, FIG. 7 shows a device 700 provided by an embodiment of the present application. The device 700 includes a processor 701, a memory 702 and a communication interface 703, and the processor 701, the memory 702 and the communication interface 703 are connected to each other through a bus 704.

The memory 702 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM). The memory 702 is used to store related computer programs and data. The communication interface 703 is used to receive and transmit data.

The processor 701 may be a device with processing functions, and may include one or more processors. The processor may be a general-purpose processor or a special-purpose processor, or the like. The processor may be a baseband processor, or a central processing unit. The baseband processor can be used to process the communication protocol and communication data, and the central processing unit can be used to control the communication device, execute the software program, and process the data of the software program.

The processor 701 in the device 700 is configured to read the computer program code stored in the memory 702. In this embodiment of the present application, the device 700 may be an abnormal link detection apparatus, a model training device, a detection node, or any other possible device.

When the device 700 is a model training device or a detection node, the processor 701 is configured to perform the following operations:

Select K unlabeled samples from the first sample set;

Select M unlabeled samples from the first sample set as negative samples;

Train the second model obtained from the previous training according to the labeled samples in the first sample set, K labeled samples, and the M unlabeled samples, and obtain the first model when the training meets the preset condition, where the K labeled samples are obtained by labeling the K unlabeled samples in the first sample set respectively, and the first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected.

In a possible example, the processor 701 is specifically configured to perform the following operations:

Obtain the abnormal score value of each unlabeled sample in the first sample set;

The first sorting is obtained by performing descending sorting according to the abnormal score value of each unlabeled sample in the first sample set;

The unlabeled samples corresponding to the first K serial numbers in the first sorting are regarded as K unlabeled samples.

Taking the unlabeled samples corresponding to the last L serial numbers in the first sorting as the L unlabeled samples;

Pick M unlabeled samples from L unlabeled samples.

Count the labeled samples in the first sample set and the number of positive samples in the K labeled samples;

According to the number of positive samples, M unlabeled samples are selected from the first sample set as negative samples, where M is equal to the number of positive samples.

In a possible example, before selecting K unlabeled samples from the first sample set, the processor 701 is further configured to perform the following operations:

Obtain the network topology information of the communication link to be detected;

A pre-stored set of unlabeled samples and labeled samples corresponding to the network topology information is used as the first sample set.

When the device 700 is an abnormal link detection apparatus, the processor 701 is configured to perform the following operations:

receiving network data for at least one network node in the communication link;

Obtain network features corresponding to network data;

Input the network features into the first model to obtain the detection result of the communication link, where the detection result is used to indicate whether the communication link is an abnormal link. The first model is the model obtained when the second model obtained from the previous training is trained according to the labeled samples in the first sample set, K labeled samples, and M unlabeled samples until the training meets the preset condition. The K labeled samples are obtained by labeling K unlabeled samples in the first sample set respectively, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected.

In a possible example, before inputting the network features into the first model, the processor 701 is further configured to perform the following operations:

In a possible example, the processor 701 is further configured to perform the following operations:

Pick M unlabeled samples from L unlabeled samples.

In a possible example, M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples.

Obtain the network topology information of the communication link;

The pre-stored set of unlabeled samples and labeled samples corresponding to the network topology information is taken as the first sample set.

When the device 700 is an abnormal link detection apparatus, a model training device, a detection node, or any other possible device, in a possible example, before obtaining the abnormal score value of each unlabeled sample in the first sample set, the processor 701 is further configured to perform the following operations:

Acquiring the abnormal score value of each unlabeled sample in the second sample set, where the second sample set includes pre-stored labeled samples and unlabeled samples before selecting the P unlabeled samples;

The second sorting is obtained by performing descending sorting according to the abnormal score value of each unlabeled sample in the second sample set;

Taking the unlabeled samples corresponding to the first P serial numbers in the second sorting as P unlabeled samples;

Construct a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by labeling the P unlabeled samples respectively, and the third model is the initialization model corresponding to the first model and the second model.

In a possible example, the network data includes at least one of the following: signal-to-noise ratio, level of an input signal, errored seconds, severely errored seconds, unavailable time, and network topology information.

It should be noted that, the implementation of each operation may also correspond to the corresponding description with reference to the method embodiments shown in FIG. 1 and FIG. 4 .

An embodiment of the present application further provides a chip, including a processor and a memory, where the processor is used to call instructions from the memory and run them, so that a device on which the chip is installed executes any of the methods shown in FIG. 1 and FIG. 4.

An embodiment of the present application also provides another chip, including an input interface, an output interface, and a processing circuit. The input interface, the output interface, and the processing circuit are connected through an internal connection path, and the processing circuit is used to perform any of the methods shown in FIG. 1 and FIG. 4.

An embodiment of the present application also provides another chip, including an input interface, an output interface, a processor, and optionally a memory. The input interface, the output interface, the processor, and the memory are connected through an internal connection path. The processor is used to execute code in the memory, and when the code is executed, the processor is used to perform any of the methods shown in FIG. 1 and FIG. 4.

The embodiments of the present application further provide a chip system, including at least one processor, a memory and an interface circuit, where the memory, the interface circuit and the at least one processor are interconnected by lines, and the memory stores a computer program; when the computer program is executed by the processor, the method flows shown in FIG. 1 and FIG. 4 are implemented.

Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program runs on a computer, the method flows shown in FIG. 1 and FIG. 4 are implemented.

Embodiments of the present application further provide a computer program product, and when the computer program product runs on a computer, the method flows shown in FIG. 1 and FIG. 4 are implemented.

To sum up, by implementing the embodiments of the present application, the first model is obtained by training the second model obtained from the previous training according to the samples selected from the unlabeled samples and the existing labeled samples. After the network data of the network nodes in the communication link is received, the network features corresponding to the network data are obtained, and the network features are then input into the first model to obtain the detection result indicating whether the communication link is an abnormal link, which improves the detection accuracy.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed, it may include the processes of the foregoing method embodiments. The aforementioned storage medium includes ROM, random access memory (RAM), a magnetic disk, an optical disk, or other media that can store computer program code.

Claims

A method for detecting abnormal links, comprising:

receiving network data for at least one network node in the communication link;

acquiring network features corresponding to the network data;

inputting the network features into a first model to obtain a detection result of the communication link, where the detection result is used to indicate whether the communication link is an abnormal link, and the first model is a model obtained when a second model obtained from the previous training is trained according to the labeled samples in a first sample set, K labeled samples, and M unlabeled samples until the training meets a preset condition, where the K labeled samples are obtained by labeling K unlabeled samples in the first sample set respectively, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected.
The method according to claim 1, wherein before the inputting the network feature into the first model, the method further comprises:

obtaining the abnormal score value of each unlabeled sample in the first sample set;

sorting the unlabeled samples in the first sample set in descending order of abnormal score value to obtain a first sorting;

The unlabeled samples corresponding to the first K sequence numbers in the first sorting are used as the K unlabeled samples.
The method according to claim 2, wherein the method further comprises:

Taking the unlabeled samples corresponding to the last L serial numbers in the first sorting as the L unlabeled samples;

The M unlabeled samples are selected from the L unlabeled samples.
The method according to any one of claims 1-3, wherein M is equal to the number of positive samples among the labeled samples in the first sample set and the K labeled samples.
The method according to any one of claims 2-4, characterized in that before acquiring the abnormal score value of each unlabeled sample in the first sample set, the method further comprises:

obtaining an abnormal score value of each unlabeled sample in a second sample set, where the second sample set includes pre-stored labeled samples and unlabeled samples before selecting P unlabeled samples;

sorting the unlabeled samples in the second sample set in descending order of abnormal score value to obtain a second sorting;

Taking the unlabeled samples corresponding to the first P serial numbers in the second sorting as the P unlabeled samples;

constructing a third model according to the labeled samples in the second sample set and P labeled samples, where the P labeled samples are obtained by labeling the P unlabeled samples respectively, and the third model is an initialization model corresponding to the first model and the second model.
The method according to any one of claims 1-5, wherein before the inputting the network feature into the first model, the method further comprises:

obtaining network topology information of the communication link;

A pre-stored set of unlabeled samples and labeled samples corresponding to the network topology information is used as the first sample set.
The method according to any one of claims 1-6, wherein the network data includes at least one of the following: signal-to-noise ratio, level of an input signal, errored seconds, severely errored seconds, unavailable time, skewness, and network topology information.
An abnormal link detection device, characterized in that it includes:

a communication unit for receiving network data of at least one network node in the communication link;

a processing unit, configured to obtain network features corresponding to the network data, and to input the network features into a first model to obtain a detection result of the communication link, where the detection result is used to indicate whether the communication link is an abnormal link, and the first model is a model obtained when a second model obtained from the previous training is trained according to the labeled samples in a first sample set, K labeled samples, and M unlabeled samples until the training meets a preset condition, where the K labeled samples are obtained by labeling K unlabeled samples in the first sample set respectively, the M unlabeled samples are unlabeled samples selected from the first sample set as negative samples, and the first sample set includes the labeled samples and unlabeled samples pre-stored before the K unlabeled samples and the M unlabeled samples are selected.
An apparatus comprising a processor, and a memory and a communication interface connected to the processor, wherein the memory is used to store one or more programs configured to be executed by the processor, and the programs include instructions for performing the steps in the method of any one of claims 1-7.
A computer storage medium, characterized by comprising computer instructions, wherein when the computer instructions are run on a terminal, the terminal is caused to execute the method according to any one of claims 1-7.