Disclosure of Invention
An embodiment of the invention provides a method for verifying data in an edge cloud collaboration process, which addresses the low efficiency and low success rate of data verification in the edge cloud collaboration process in the prior art.
An embodiment of the invention provides a method for verifying data in an edge cloud collaboration process. The method is applied to an edge cloud network architecture comprising a central cloud, an intelligent edge cloud, and a user terminal (UE). The central cloud comprises an infrastructure as a service (IaaS) layer, a software as a service (SaaS) layer, and a platform as a service (PaaS) layer, and the IaaS layer comprises a central cloud manager. The intelligent edge cloud comprises an edge infrastructure as a service (ECIaaS) layer, an edge software as a service (ECSaaS) layer, and an edge platform as a service (ECPaaS) layer, and the ECIaaS layer comprises an edge cloud manager. The method comprises the following steps:
the central cloud manager sends a data acquisition instruction to the edge cloud manager;
the edge cloud manager returns multidimensional network element data stored in the ECIaaS layer to the central cloud manager;
the central cloud manager compares the network element data stored in the central cloud IaaS layer with the network element data stored in the ECIaaS layer;
if the comparison result is inconsistent, starting an anomaly detection algorithm to perform anomaly detection on the network element data stored in the central cloud IaaS layer and/or the network element data stored in the ECIaaS layer;
and if the network element data is abnormal, performing anomaly remediation.
Optionally, the comparing, by the central cloud manager, of the network element data stored in the central cloud IaaS layer with the network element data stored in the ECIaaS layer includes:
the central cloud manager acquires, in batches, all network element data of the central cloud IaaS layer and all network element data of the ECIaaS layer, the acquired network element data being defined as first network element data;
intercepting second network element data from the first network element data through a time sliding window, wherein the data volume of the second network element data is smaller than that of the first network element data;
acquiring an ID and performance index values of the second network element data through a cursor, wherein the performance index values comprise a performance index value stored in the central cloud IaaS layer and a performance index value stored in the ECIaaS layer;
and comparing the performance index value stored in the central cloud IaaS layer with the performance index value stored in the ECIaaS layer.
Optionally, before the edge cloud manager returns the multidimensional network element data stored in the ECIaaS layer to the central cloud manager, the method further includes:
after receiving the data acquisition instruction, the edge cloud manager determines whether the ECIaaS layer has a service backlog;
if no service backlog exists, the edge cloud manager dynamically acquires instruction parameters of each network element of the edge cloud;
the edge cloud manager sends a network element data request to each network element of the edge cloud, wherein the network element data request comprises network element performance parameters corresponding to the instruction parameters;
the edge cloud receives network element data sent by each network element of the edge cloud, wherein the network element data comprises a network element ID and a corresponding network performance index value, and the network performance index value is a real-time value corresponding to the network element performance parameter;
setting up a check rule, analyzing and extracting network element data associated with the check rule, and filtering out network element data irrelevant to the check rule.
Optionally, the check rule includes a range of selecting network elements, a range of selecting network element indexes, a time window and a check frequency.
Optionally, the starting of the anomaly detection algorithm to perform anomaly detection on the network element data stored in the central cloud IaaS layer and/or the network element data stored in the ECIaaS layer includes:
acquiring a time series of the network element data stored in the central cloud IaaS layer and/or of the network element data stored in the ECIaaS layer, the time series being defined as a first time series;
extracting alarm features of the first time series through a word2vec model and a seq2seq model;
mining the causal relations among the alarm features by using the max-min hill-climbing (MMHC) algorithm, and acquiring a causal relation matrix among the features;
and performing alarm root cause detection on the causal relation matrix by using a graph attention neural network.
Optionally, the mining of the causal relations among the alarm features by using the max-min hill-climbing (MMHC) algorithm includes:
sampling the alarm features and extracting sample information;
based on the sample information, constructing the skeleton of a Bayesian network by using the local causal discovery algorithm MMPC;
performing scoring search through a greedy search algorithm, and determining edges of the network structure and directions of the edges;
based on the edges of the network structure and the direction of the edges, a causal relation graph between variables is generated and is converted into a causal relation matrix between alarms.
Optionally, the performing of alarm root cause detection on the causal relation matrix by using the graph attention neural network includes:
taking the alarm features and the causal relation matrix as inputs of the graph attention neural network;
acquiring alarm root cause parameters and corresponding weights of the alarm features through the graph attention network;
performing association analysis on the alarm root cause parameters through an association relation algorithm to obtain associated alarm root cause factors;
and sequentially inputting the alarm root cause parameters, the corresponding weights, and the associated root cause factors into a softmax classifier to obtain the probabilities of the different alarm root causes, and taking the alarm root cause with the highest probability as the root cause of the abnormal alarm.
Optionally, the anomaly remediation includes:
restarting the network element, or updating the data of the edge cloud.
Optionally, the network element data includes network element node IP, an original directory, a target directory, a file name, a matching rule, a collection period, a collection start time, and a collection mode.
Optionally, before the central cloud manager sends the data acquisition instruction to the edge cloud manager, the method further includes:
the central cloud manager sends a handshake request to the edge cloud manager;
the edge cloud manager sends a handshake response to the central cloud manager, wherein the handshake response carries an edge cloud certificate;
the central cloud manager checks the edge cloud certificate, and registers the edge cloud after the edge cloud certificate is checked successfully;
and the central cloud manager sends a registration success message to the edge cloud manager to finish the initialization process of the edge cloud interaction request.
An embodiment of the invention also provides a device comprising a memory and a processor, wherein the memory stores computer-executable instructions, and the processor implements the above method when executing the computer-executable instructions stored in the memory.
According to the method provided by the embodiment of the invention, the network element data of the edge cloud is compared with the network element data of the central cloud. If the two are inconsistent, data interference may exist and the root cause needs to be traced: an anomaly detection algorithm is started to perform anomaly detection, the cause of the data interference is identified, and automatic remedial measures are taken.
Detailed Description
The technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are evidently some, but not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without creative effort fall within the scope of the present disclosure.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when" or "once" or "in response to a determination" or "in response to detection" depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination" or "in response to determination" or "upon detection of a [described condition or event]" or "in response to detection of a [described condition or event]".
Fig. 1 shows an edge cloud network architecture provided by an embodiment of the present invention. As shown in fig. 1, the edge cloud network architecture 10 includes a central cloud 11, an intelligent edge cloud 12, and a user terminal UE 13. The central cloud 11 is the cloud side and has strong data storage and processing capability. It includes an infrastructure as a service (IaaS) layer, a software as a service (SaaS) layer, and a platform as a service (PaaS) layer. The IaaS layer further includes a central cloud manager, which performs basic database operations on the data of the central cloud, such as sending, receiving, storing, and extracting. The intelligent edge cloud 12 is an edge cloud with artificial intelligence (AI) capability. An edge cloud is edge computing in cloud form: small- and medium-scale cloud service or cloud-like service capability is built on the edge side, and the provision of the main capabilities and the processing of the core business logic rely mainly on the edge cloud. The edge cloud is located at the edge layer, and its architecture includes an edge infrastructure as a service (ECIaaS) layer, an edge software as a service (ECSaaS) layer, and an edge platform as a service (ECPaaS) layer; the ECIaaS layer includes an edge cloud manager. The UE 13 covers all UE devices capable of wired or wireless access to the cloud, including but not limited to mobile terminals, PCs, tablet computers, cameras, and VR and AR devices.
Fig. 2 is a flowchart of a method for performing data verification in an edge cloud collaboration process according to an embodiment of the present invention, where the method provided by the embodiment of the present invention specifically includes:
S101, the central cloud manager sends a data acquisition instruction to the edge cloud manager;
The central cloud manager is responsible for sending, receiving, and storing data. During edge cloud collaboration, the central cloud manager must verify the data of the edge cloud against the data of the central cloud, so it needs to acquire network element data from the edge cloud and compare the network element data.
In the embodiment of the invention, the central cloud manager sends the data acquisition instruction to the edge cloud manager, and the instruction may be carried over TCP/IP. For ease of management, and to affect normal data sending and receiving at the service layer as little as possible, the instruction may be sent periodically, for example once a week or once every two weeks.
Before S101, the central cloud manager and the edge cloud manager also need to perform an initialization operation, which specifically includes the following steps:
the central cloud manager sends a handshake request to the edge cloud manager;
the edge cloud manager sends a handshake response to the central cloud manager, wherein the handshake response carries an edge cloud certificate, and the edge cloud certificate is used for carrying out authentication and verification of the edge cloud;
the central cloud manager checks the edge cloud certificate, and successfully registers the edge cloud after the edge cloud certificate is checked successfully, namely, the information of the edge cloud is registered in a management list of the central cloud manager;
and the central cloud manager sends a registration success message to the edge cloud manager to finish the initialization process of the edge cloud interaction request.
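The initialization steps above can be illustrated with a minimal sketch. The class names, the message strings, and the fingerprint-based certificate check are assumptions made for illustration only; the embodiment does not specify how the edge cloud certificate is checked, and a real deployment would validate a certificate chain rather than compare hashes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class HandshakeResponse:
    """Hypothetical handshake response sent by the edge cloud manager."""
    edge_id: str
    certificate: bytes  # edge cloud certificate carried in the response


@dataclass
class CentralCloudManager:
    """Hypothetical central cloud manager holding a management list."""
    trusted_fingerprints: Set[str]
    management_list: Dict[str, str] = field(default_factory=dict)

    def check_certificate(self, certificate: bytes) -> bool:
        # Assumption: a certificate is accepted when its SHA-256
        # fingerprint is in a preconfigured trusted set.
        fingerprint = hashlib.sha256(certificate).hexdigest()
        return fingerprint in self.trusted_fingerprints

    def register(self, response: HandshakeResponse) -> str:
        # Register the edge cloud only after the certificate check succeeds.
        if not self.check_certificate(response.certificate):
            return "REGISTRATION_REJECTED"
        fingerprint = hashlib.sha256(response.certificate).hexdigest()
        self.management_list[response.edge_id] = fingerprint
        return "REGISTRATION_SUCCESS"
```

In this sketch the returned string stands in for the registration success message; an edge cloud whose certificate fails the check never enters the management list.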
S102, the edge cloud manager returns multidimensional network element data stored in the ECIaaS layer to the central cloud manager;
After receiving the data acquisition instruction sent by the central cloud manager, the edge cloud manager acquires the multidimensional network element data and returns it to the central cloud manager. The multidimensional network element data comprises data on different attributes or different parameters of different network elements. For example, the multidimensional network element data may include a network element ID, a network element node IP, an original directory, a target directory, a file name, a matching rule, a collection period, a collection start time, a collection mode, and the like.
It should be noted that the edge cloud manager queries the ECIaaS database in real time using query statements to obtain the different network element data. To facilitate data verification, the edge cloud manager need not query and send all network element data at once; instead, it may send part of the network element data by time period and in stages, which can improve the efficiency of the subsequent data comparison.
In addition, before the multidimensional network element data stored in the ECIaaS layer is returned in S102, the edge cloud manager, after receiving the data acquisition instruction, determines whether the ECIaaS layer has a service backlog. If no service backlog exists, the edge cloud manager dynamically acquires the instruction parameters of each network element of the edge cloud; the instruction parameters are generated by a preset protocol and serve to acquire the data of each network element of the edge cloud. The edge cloud manager then sends a network element data request to each network element of the edge cloud, the request comprising the network element performance parameters corresponding to the instruction parameters. The edge cloud receives the network element data sent by each network element, the data comprising a network element ID and a corresponding network performance index value, where the network performance index value is the real-time value of the corresponding network element performance parameter. Finally, a check rule is set up, the network element data associated with the check rule is parsed and extracted, and the network element data irrelevant to the check rule is filtered out. The check rule comprises the range of selected network elements, the range of selected network element indexes, a time window, and a check frequency.
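As an illustration of the check rule described above, the following sketch filters network element records against a rule. The record layout, the field names, and the representation of the check frequency as an interval in seconds are assumptions for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical record layout: (network element ID, index name, timestamp, value)
Record = Tuple[str, str, int, float]


@dataclass(frozen=True)
class CheckRule:
    element_ids: FrozenSet[str]   # range of selected network elements
    index_names: FrozenSet[str]   # range of selected network element indexes
    window: Tuple[int, int]       # time window as [start, end) in epoch seconds
    check_interval_s: int         # check frequency, as seconds between checks


def filter_by_rule(records: List[Record], rule: CheckRule) -> List[Record]:
    """Keep records associated with the rule and drop the irrelevant ones."""
    start, end = rule.window
    return [
        r for r in records
        if r[0] in rule.element_ids
        and r[1] in rule.index_names
        and start <= r[2] < end
    ]
```

The check interval is carried in the rule but not used by the filter itself; in this sketch it would only tell a scheduler how often to run the check.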
S103, the central cloud manager compares the network element data stored in the central cloud IaaS layer with the network element data stored in the ECIaaS layer;
In S103, the network element data stored in the central cloud IaaS layer and the network element data stored in the edge cloud ECIaaS layer need to remain consistent. If they are inconsistent, subsequent data may become confused or become unreadable or unwritable, and in severe cases various devices of the edge cloud may go down. Therefore, when an inconsistency exists, the data must be audited so that data problems caused by abnormal network conditions are eliminated and responded to in time. In the embodiment of the present invention, as shown in fig. 3, the network element data comparison specifically includes the following steps:
S1031, the central cloud manager acquires, in batches, all network element data of the central cloud IaaS layer and of the ECIaaS layer, and the set of this network element data is defined as first network element data;
Optionally, the first network element data is also preprocessed to filter out unnecessary data noise.
S1032, intercepting second network element data from the first network element data through a time sliding window, wherein the data volume of the second network element data is smaller than that of the first network element data. Because the first network element data is large in volume, it is intercepted through a time sliding window. The window may be set to, for example, one minute or one hour, and intercepting data through the window yields lightweight second network element data.
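The interception in S1032 can be sketched as follows. The sample layout, the function name, and the separate step parameter are assumptions for illustration; the embodiment only states that a time sliding window of, for example, one minute or one hour is used.

```python
from typing import Iterator, List, Tuple

# Hypothetical sample layout: (timestamp in seconds, network element ID, KPI value)
Sample = Tuple[int, str, float]


def sliding_windows(
    samples: List[Sample], width_s: int, step_s: int
) -> Iterator[List[Sample]]:
    """Yield successive slices of the first network element data.

    Each yielded slice plays the role of the second network element data:
    it covers [start, start + width_s) and is no larger than the full set.
    """
    if not samples:
        return
    ordered = sorted(samples, key=lambda s: s[0])
    start = ordered[0][0]
    last = ordered[-1][0]
    while start <= last:
        yield [s for s in ordered if start <= s[0] < start + width_s]
        start += step_s
```

With `width_s` equal to `step_s` the windows tile the time axis without overlap; a smaller step gives overlapping windows.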
S1033, acquiring an ID and performance index values of the second network element data through a cursor, wherein the performance index values comprise a performance index value stored in the central cloud IaaS layer and a performance index value stored in the ECIaaS layer;
A cursor is a mechanism that extracts one record at a time from a result set containing multiple data records; that is, the cursor is used to read the result set row by row and acts as a pointer. Although the cursor can traverse all rows in the result set, it points to only one row at a time. In SQL, a cursor is a temporary database object that either holds a copy of data rows stored in a database table or points to data rows stored in the database. Cursors provide a way to manipulate the data in a table row by row. One common use of a cursor is to save query results for later use. The result set of a cursor is generated by a SELECT statement; if the processing needs to reuse a record set, creating the cursor once and reusing it several times is much faster than querying the database repeatedly. Thus, the cursor is primarily used to loop through the result set.
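Row-by-row reading through a cursor can be illustrated with Python's standard `sqlite3` module. The table name `ne_kpi` and its columns are hypothetical; the embodiment does not name a database or schema.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[str, float, float]  # (network element ID, IaaS value, ECIaaS value)


def read_rows_with_cursor(rows: List[Row]) -> List[Row]:
    """Store rows in an in-memory table and read them back one at a time."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ne_kpi (ne_id TEXT, iaas_value REAL, eciaas_value REAL)"
        )
        conn.executemany("INSERT INTO ne_kpi VALUES (?, ?, ?)", rows)
        # The SELECT statement generates the cursor's result set.
        cursor = conn.execute(
            "SELECT ne_id, iaas_value, eciaas_value FROM ne_kpi ORDER BY ne_id"
        )
        result = []
        row = cursor.fetchone()  # the cursor points to one row at a time
        while row is not None:
            result.append(row)
            row = cursor.fetchone()
        return result
    finally:
        conn.close()
```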
In S1033, the ID and the performance index values of the second network element data are obtained through the cursor. A performance index, also called a key performance indicator (KPI), is a key parameter for measuring a device or server. Different KPIs need to be monitored in real time during daily operation and maintenance, so keeping KPIs consistent during edge cloud collaboration is particularly important.
S1034, comparing the performance index value stored in the central cloud IaaS layer with the performance index value stored in the ECIaaS layer. That is, each KPI in the IaaS layer is compared with the KPI of the same network element in the ECIaaS layer. If the two values match, the traversal moves to the next network element and continues to compare whether that network element's KPIs are the same in the IaaS layer and the ECIaaS layer.
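The comparison in S1034 can be sketched as a traversal over network element IDs. The dictionary-based layout, the function name, and the optional tolerance are assumptions for illustration; the embodiment only requires that the values of the same network element be compared, and the treatment of a network element present on only one side as inconsistent is likewise an assumption.

```python
import math
from typing import Dict, List


def find_inconsistent(
    central: Dict[str, float], edge: Dict[str, float], rel_tol: float = 0.0
) -> List[str]:
    """Return the IDs of network elements whose KPI values differ.

    central maps network element ID to the value stored in the IaaS layer;
    edge maps network element ID to the value stored in the ECIaaS layer.
    """
    mismatched = []
    for ne_id in sorted(set(central) | set(edge)):
        c = central.get(ne_id)
        e = edge.get(ne_id)
        # Assumption: an element missing on either side counts as inconsistent.
        if c is None or e is None:
            mismatched.append(ne_id)
        elif not math.isclose(c, e, rel_tol=rel_tol, abs_tol=0.0):
            mismatched.append(ne_id)
    return mismatched
```

A non-empty result corresponds to an inconsistent comparison result, which is the condition that triggers anomaly detection in S104.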
S104, if the comparison result is inconsistent, starting an anomaly detection algorithm to perform anomaly detection on the network element data stored in the central cloud IaaS layer and/or the network element data stored in the ECIaaS layer;
Specifically, the anomaly detection algorithm adopts an alarm root cause identification method based on causal network mining and a graph attention network. The method combines the max-min hill-climbing (MMHC) algorithm with a graph attention algorithm to accurately identify the root cause alarms in the alarm data: the former mines the causal relation network between alarms, and the latter trains the model by combining the existing causal graph with the features of the alarm data. Specifically, as shown in fig. 4, the method includes:
S1041, acquiring a time series of the network element data stored in the central cloud IaaS layer and/or of the network element data stored in the ECIaaS layer, the time series being defined as a first time series;
S1042, extracting alarm features of the first time series through a word2vec model and a seq2seq model. For example, if the first time series contains two performance parameters, SNR and RSRQ, and both increase at the same time and a correlation between them is confirmed, the two performance parameters are extracted as alarm features. The word2vec and seq2seq models used for alarm feature extraction belong to the prior art and are not described here.
S1043, mining the causal relations among the alarm features by using the max-min hill-climbing (MMHC) algorithm, and acquiring a causal relation matrix among the features;
The MMHC algorithm builds on the idea of the constraint-space-based sparse candidate algorithm and processes the data using the local causal discovery algorithm MMPC and a greedy search algorithm. Specifically, the mining includes:
sampling the alarm features and extracting sample information, where the sampling rate may be customized;
based on the sample information, constructing the skeleton of a Bayesian network by using the local causal discovery algorithm MMPC;
performing a scoring search through a greedy search algorithm, and determining the edges of the network structure and the directions of the edges;
based on the edges of the network structure and the directions of the edges, generating a causal relation graph between variables and converting it into a causal relation matrix between alarms.
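The last step above, converting a directed causal relation graph into a causal relation matrix, can be sketched as follows. This sketch does not implement MMPC or the greedy scoring search; it assumes their output is already available as a list of directed edges, and the 0/1 encoding of the matrix is an assumption for illustration.

```python
from typing import List, Sequence, Tuple


def causal_matrix(
    alarms: Sequence[str], edges: Sequence[Tuple[str, str]]
) -> List[List[int]]:
    """Build a matrix m where m[i][j] == 1 means alarm i is a cause of alarm j.

    alarms fixes the row and column order; edges are (cause, effect) pairs
    taken from the directed causal relation graph.
    """
    index = {name: i for i, name in enumerate(alarms)}
    matrix = [[0] * len(alarms) for _ in alarms]
    for cause, effect in edges:
        matrix[index[cause]][index[effect]] = 1
    return matrix
```

Because the edges are directed, the matrix is generally not symmetric: an edge from alarm A to alarm B sets only the entry in row A, column B.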
S1044, performing alarm root cause detection on the causal relation matrix by using a graph attention neural network.
Taking the alarm features and the causal relation matrix as inputs of the graph attention neural network;
acquiring alarm root cause parameters and corresponding weights of the alarm features through the graph attention network (GAT); the weights may be adjusted dynamically according to hyperparameter settings such as the number of occurrences of the alarm root cause, the failure rate, and the severity of the adverse impact;
performing association analysis on the alarm root cause parameters through an association relation algorithm (for example, a co-correlation algorithm) to obtain associated alarm root cause factors; as in the example above, if the correlated parameters RSRQ and SNR increase or decrease together, the two types of parameters are defined as associated root cause factors.
And sequentially inputting the alarm root cause parameters, the corresponding weights, and the associated root cause factors into a softmax classifier to obtain the probabilities of the different alarm root causes, and taking the alarm root cause with the highest probability as the root cause of the abnormal alarm.
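The final softmax step can be illustrated with a small sketch. How the root cause parameter, its weight, and the associated root cause factor are combined into one score is not specified in the embodiment; the linear combination used below (weight times parameter, plus the association term) is purely an assumption, as are the function and candidate names.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_probable_root_cause(
    candidates: Dict[str, Tuple[float, float, float]]
) -> Tuple[str, Dict[str, float]]:
    """Pick the alarm root cause with the highest softmax probability.

    candidates maps a root cause name to (parameter, weight, association).
    Assumption: score = weight * parameter + association.
    """
    names = list(candidates)
    scores = [w * p + a for (p, w, a) in (candidates[n] for n in names)]
    probs = softmax(scores)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))
```

The softmax only normalizes the scores into probabilities that sum to one, so the selected root cause is always the candidate with the highest score.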
S105, if the network element data is abnormal, performing anomaly remediation.
The anomaly remediation may be manual or automatic, for example restarting the network element or updating the data of the edge cloud.
According to the method and the device provided by the embodiment of the invention, the network element data of the edge cloud is compared with the network element data of the central cloud. If the two are inconsistent, data interference may exist, and an anomaly detection algorithm is started to trace the root cause and take remedial measures, which improves the efficiency and success rate of data verification in the edge cloud collaboration process.
The embodiment of the present invention also provides a computer-readable storage medium having stored thereon computer-executable instructions for performing the method of the above-described embodiment.
The embodiment of the invention also provides a device which comprises a memory and a processor, wherein the memory stores computer executable instructions, and the processor realizes the method when running the computer executable instructions on the memory.
FIG. 5 is a schematic diagram of the hardware components of the device in one embodiment. It will be appreciated that fig. 5 shows only a simplified design of the device. In practical applications, the device may further include other necessary elements, including but not limited to any number of input/output systems, processors, controllers, and memories, and all devices capable of implementing the data verification method of the embodiments of the present application are within the scope of protection of the present application.
The memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or portable read-only memory (compact disc read-only memory, CD-ROM), and is used to store the associated instructions and data.
The input system is used for inputting data and/or signals, and the output system is used for outputting data and/or signals. The output system and the input system may be separate devices or may be a single device.
The processor may include one or more processors, for example one or more central processing units (CPUs); each CPU may be a single-core CPU or a multi-core CPU. The processor may also include one or more special purpose processors, such as GPUs or FPGAs, for accelerated processing.
The memory is used to store program codes and data for the network device.
The processor is used to call the program code and data in the memory to perform the steps of the method embodiments described above. Reference may be made specifically to the description of the method embodiments, and no further description is given here.
In the several embodiments provided in this application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, systems, or units, and may be electrical, mechanical, or in another form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable system. The computer instructions may be stored in a computer-readable storage medium or transmitted through a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic medium such as a floppy disk, a hard disk, a magnetic tape, or a magnetic disk, an optical medium such as a digital versatile disc (DVD), or a semiconductor medium such as a solid state disk (SSD).
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any equivalent modifications or substitutions will be apparent to those skilled in the art within the scope of the present application, and these modifications or substitutions should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.