CN113919420A - Data generation method, device, equipment and medium of computing cluster - Google Patents


Info

Publication number
CN113919420A
Authority
CN
China
Prior art date
Legal status
Pending
Application number
CN202111162285.5A
Other languages
Chinese (zh)
Inventor
李子佳
Current Assignee
Pingan Payment Technology Service Co Ltd
Original Assignee
Pingan Payment Technology Service Co Ltd
Priority date
Filing date
Publication date
Application filed by Pingan Payment Technology Service Co Ltd filed Critical Pingan Payment Technology Service Co Ltd
Priority to CN202111162285.5A priority Critical patent/CN113919420A/en
Publication of CN113919420A publication Critical patent/CN113919420A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/02 — Neural networks
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods


Abstract

The invention relates to the field of artificial intelligence and provides a data generation method, apparatus, device and medium for a computing cluster. A computing cluster is constructed from devices to be processed, and a topology is generated that clearly reflects the relationships among the devices in the cluster. Training data are generated from the topology, the initial state of each device to be processed, each operation datum, and the final state of each device to be processed; a data generation model constructed based on a variational self-encoder is trained with these data. Data to be processed are then input into the data generation model, and a computing cluster data set is constructed from the model's output. Data simulation is thus realized through the model, and data are generated automatically by artificial-intelligence means, which is efficient and accurate; when a large amount of data is generated, adverse effects on the normal operation of the production line are effectively avoided. The invention further relates to blockchain technology: the data generation model may be stored in a blockchain node.

Description

Data generation method, device, equipment and medium of computing cluster
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a medium for generating data of a computing cluster.
Background
When performing fault detection, fault recovery and related system development on a computing cluster, a significant difficulty is the lack of an appropriate environment for developing and testing the related functions, acquiring data, and training models.
For example: the method is generally adopted for carrying out research and development and testing of corresponding algorithms in a production line environment, and because the production line environment is relatively stable, used data are real but are relatively unilateral, most of the used data are system data in a safe or stable state, abnormal data quantity is relatively small, the method is difficult to be used for training of models such as fault detection or fault positioning and the like, and meanwhile, serious potential safety hazards can be caused, and even production line accidents are caused.
However, the prior art offers no method dedicated to simulating computing cluster data, which limits the execution of task scenarios such as fault detection and fault recovery.
Disclosure of Invention
In view of the above, there is a need for a data generation method, apparatus, device and medium for a computing cluster, aiming at solving the problem of simulating computing cluster data.
A data generation method of a computing cluster, the data generation method of the computing cluster comprising:
acquiring equipment to be processed, and constructing a computing cluster according to the equipment to be processed;
generating a topology of the computing cluster;
acquiring an initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster, and a final state of each device to be processed after each operation data is executed;
generating training data according to the topological structure, the initial state of each device to be processed, each operation data and the final state of each device to be processed;
constructing an initial model based on a variational self-encoder;
training the initial model by using the training data to obtain a data generation model;
and acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data set according to the output of the data generation model.
According to a preferred embodiment of the present invention, the generating the topology of the computing cluster includes:
acquiring a calling relation between the devices to be processed;
determining each device to be processed as a node;
constructing directed edges among the nodes according to the calling relationship among the devices to be processed;
and numbering each node according to a configuration sequence to obtain the topological structure.
According to a preferred embodiment of the present invention, the generating training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed includes:
coding each operation data to obtain an operation vector;
for each node in the topological structure, vectorizing each node according to the initial state of each device to be processed to generate an initial state signal of each node;
constructing a matrix based on the topological structure and the initial state signal of each node to obtain a set of first state data of the computing cluster; wherein each first state data corresponds to each operation vector in the set of first state data;
vectorizing each node according to the final state of each device to be processed to generate a final state signal of each node;
constructing a matrix based on the topological structure and the final state signal of each node to obtain a set of second state data of the computing cluster; wherein each second state data corresponds to each operation vector in the set of second state data;
grouping together the first state data and the second state data that correspond to the same operation vector, together with that operation vector, to obtain at least one data group;
and integrating the at least one data set to obtain the training data.
According to the preferred embodiment of the present invention, the constructing the initial model based on the variational self-encoder comprises:
acquiring output data of an encoder in the variational self-encoder, and acquiring a mean signal and a variance signal from the output data;
acquiring random noise;
fusing the mean signal and the variance signal with the random noise to obtain a noise code;
adding an operation coding layer in the variational self-encoder, and deploying the operation vector on the operation coding layer;
merging the noise codes and the operation vectors to obtain hidden vectors;
and inputting the implicit vector to a decoder in the variational self-encoder to obtain the initial model.
According to a preferred embodiment of the present invention, the training the initial model by using the training data to obtain a data generation model includes:
constructing a target loss function;
performing gradient descent training on the initial model by using the target loss function and based on the training data;
and when the value of the target loss function is smaller than or equal to a preset threshold value, stopping training, and determining the currently obtained model as the data generation model.
According to a preferred embodiment of the present invention, said constructing the objective loss function comprises:
the output loss component is calculated using the following equation:
L_output = SUM(square(X′ − X′_output))
wherein L_output represents the output loss component, X′ represents second state data obtained from the training data, and X′_output represents the second state data output by the model during training;
the hidden layer loss component is calculated using the following formula:
L_latent = −0.5 × SUM(I + Z_log_var − square(Z_mean) − exp(Z_log_var))
wherein L_latent represents the hidden layer loss component, I represents a constant matrix, Z_log_var represents the output data of the variance output layer of the model, and Z_mean represents the output data of the mean output layer of the model;
and calculating the sum of the output loss component and the hidden layer loss component to obtain the target loss function.
According to a preferred embodiment of the present invention, before the inputting the data to be processed into the data generation model, the method further comprises:
identifying a current task scene;
when the current task scene needs to generate abnormal data, acquiring the devices to be processed which are backups of each other from the devices to be processed;
and configuring the initial state of the devices to be processed which are backups of each other as an exception.
A data generation apparatus of a computing cluster, the data generation apparatus of the computing cluster comprising:
the device comprises a construction unit, a processing unit and a processing unit, wherein the construction unit is used for acquiring a device to be processed and constructing a computing cluster according to the device to be processed;
a generating unit, configured to generate a topology structure of the computing cluster;
the acquisition unit is used for acquiring the initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster and the final state of each device to be processed after each operation data is executed;
the generating unit is further configured to generate training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed;
the construction unit is also used for constructing an initial model based on the variational self-encoder;
the training unit is used for training the initial model by using the training data to obtain a data generation model;
the construction unit is further used for acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data set according to the output of the data generation model.
A computer device, the computer device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the data generation method of the computing cluster.
A computer-readable storage medium having stored therein at least one instruction, the at least one instruction being executed by a processor in a computer device to implement the data generation method of the computing cluster.
It can be seen from the above technical solutions that the present invention can acquire devices to be processed, construct a computing cluster from them, and generate a topology of the computing cluster that clearly reflects the relationships among the devices to be processed. From the computing cluster it collects the initial state of each device to be processed, each operation datum executed on the cluster, and the final state of each device after each operation datum is executed, and generates training data from the topology, the initial states, the operation data, and the final states. An initial model is constructed based on a variational self-encoder and trained with the training data to obtain a data generation model. Data to be processed are then acquired and input into the data generation model, and a computing cluster data set is constructed according to the model's output. Data simulation is realized through the model, and data are generated automatically by artificial-intelligence means, efficiently and accurately; while a large amount of data is generated, adverse effects on the normal operation of the production line are effectively avoided.
Drawings
FIG. 1 is a flow chart of a data generation method of a computing cluster according to a preferred embodiment of the present invention.
FIG. 2 is a functional block diagram of a data generating apparatus of a computing cluster according to a preferred embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a computer device according to a preferred embodiment of the present invention, which implements the data generation method for a computing cluster.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a data generation method of a computing cluster according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The data generation method of the computing cluster is applied to one or more computer devices, which are devices capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. The hardware of the computer devices includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive web Television (IPTV), an intelligent wearable device, and the like.
The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
Among them, Artificial Intelligence (AI) is the theory, method, technique and application system of using a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use the knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The Network in which the computer device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, acquiring the equipment to be processed, and constructing a computing cluster according to the equipment to be processed.
In at least one embodiment of the present invention, the device to be processed may include a plurality of virtual machines or a plurality of servers, which is not limited by the present invention.
In at least one embodiment of the present invention, the devices to be processed are grouped into a group, so as to obtain the computing cluster.
S11, generating the topological structure of the computing cluster.
It is understood that there are certain calling relationships between the devices to be processed; therefore, the computing cluster can be represented in the form of graph data.
In at least one embodiment of the present invention, the generating the topology of the computing cluster includes:
acquiring a calling relation between the devices to be processed;
determining each device to be processed as a node;
constructing directed edges among the nodes according to the calling relationship among the devices to be processed;
and numbering each node according to a configuration sequence to obtain the topological structure.
In this embodiment, the configuration sequence may be a work serial number of the to-be-processed device, or may be configured by a user, which is not limited in the present invention.
For example, the generated topology can be in matrix form, recorded as:

G = (G_ij) ∈ R^{N×N}, where G_ij = 1 if there is a directed edge from node i to node j and G_ij = 0 otherwise,

wherein N is the total number of nodes.
Through the embodiment, the generated topological structure can clearly reflect the relationship among the devices to be processed in the computing cluster.
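The node-and-directed-edge construction above can be sketched in Python; the `(caller, callee)` input format is an assumption for illustration:

```python
import numpy as np

def build_topology(num_nodes, call_relations):
    """Build a directed-adjacency topology matrix for the cluster.

    call_relations: iterable of (caller, callee) node numbers
    (hypothetical input format; node numbering follows the
    configuration sequence described above).
    """
    G = np.zeros((num_nodes, num_nodes))
    for caller, callee in call_relations:
        G[caller, callee] = 1.0  # directed edge: caller -> callee
    return G

# Example: 4 devices, node 0 calls nodes 1 and 2, node 2 calls node 3.
G = build_topology(4, [(0, 1), (0, 2), (2, 3)])
print(G.shape)  # (4, 4)
```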
S12, collecting the initial state of each device to be processed, each operation data executed to the computing cluster, and the final state of each device to be processed after each operation data is executed from the computing cluster.
In at least one embodiment of the present invention, the initial state refers to the state of the corresponding device to be processed before a specified operation is performed, such as: the utilization rate of the Central Processing Unit (CPU) of the corresponding device to be processed, the utilization rate of the memory, whether the device is in a powered-on state, and the like.
In at least one embodiment of the present invention, each operation datum executed on the computing cluster refers to an operation performed on the computing cluster, such as power-on, power-off, stopping a computation task, and the like.
In at least one embodiment of the present invention, the final state refers to a state of the corresponding device to be processed after the specified operation is performed, such as: and after the starting operation is executed, the utilization rate of the CPU, the utilization rate of the memory, whether the equipment to be processed is in the starting state or not and other indexes are obtained.
And S13, generating training data according to the topological structure, the initial state of each device to be processed, each operation data and the final state of each device to be processed.
In at least one embodiment of the present invention, the generating training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed includes:
coding each operation data to obtain an operation vector;
for each node in the topological structure, vectorizing each node according to the initial state of each device to be processed to generate an initial state signal of each node;
constructing a matrix based on the topological structure and the initial state signal of each node to obtain a set of first state data of the computing cluster; wherein each first state data corresponds to each operation vector in the set of first state data;
vectorizing each node according to the final state of each device to be processed to generate a final state signal of each node;
constructing a matrix based on the topological structure and the final state signal of each node to obtain a set of second state data of the computing cluster; wherein each second state data corresponds to each operation vector in the set of second state data;
grouping together the first state data and the second state data that correspond to the same operation vector, together with that operation vector, to obtain at least one data group;
and integrating the at least one data set to obtain the training data.
For example, when vectorizing each node according to the initial state of each device to be processed: if the CPU utilization of node A is 20%, it may be recorded as 0.2; if the memory usage of node A is 50%, it may be recorded as 0.5; if node A is in the powered-on state, it may be marked as 1; if node A is in the powered-off state, it may be marked as 0. The values corresponding to these states are then concatenated horizontally to obtain the initial state signal of node A.
Further, once the number of nodes and the topology of the computing cluster are determined, the first state data of the whole cluster may be represented as the set of initial state signals on all nodes, recorded as a matrix X ∈ R^{N×D}, where D is the signal dimension. For example, when there are 100 nodes and each node corresponds to 10 initial states, the matrix corresponding to the first state data has dimension 100 × 10 (D = 10). In this matrix, the i-th row represents the initial state signal of the i-th node.
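As a sketch of the vectorization described above (the three indicator fields are assumptions for illustration):

```python
import numpy as np

def node_state_signal(cpu_util, mem_util, powered_on):
    # CPU/memory utilization as fractions, power state as 0/1,
    # concatenated horizontally as described in the text.
    return np.array([cpu_util, mem_util, 1.0 if powered_on else 0.0])

# First state data X for a 3-node cluster (so D = 3 here).
X = np.stack([
    node_state_signal(0.2, 0.5, True),   # node A
    node_state_signal(0.7, 0.3, True),   # node B
    node_state_signal(0.0, 0.0, False),  # node C
])
print(X.shape)  # (3, 3): row i is the initial state signal of node i
```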
In this embodiment, each operation data may be encoded by using a one-hot encoding algorithm, so as to obtain the operation vector.
For example: the code to start up is denoted as 001, the code to shut down is denoted as 010, and the code to stop the computation task is denoted as 001.
Further, the operations performed on the entire cluster may be represented as the set of operations on all nodes, recorded as a matrix A ∈ R^{N×K}, where K is the total number of operation classes. In this matrix, the entry in the i-th row and j-th column indicates whether the j-th operation is performed on the i-th node.
In this embodiment, after the corresponding operation is performed on the computing cluster, the second state data of the computing cluster can be obtained, recorded as a matrix X′ ∈ R^{N×D}.
Further, the first state data and the second state data corresponding to the same operation vector are grouped together with that operation vector; the resulting group of data may be denoted (X, A, X′), where X represents the first state data, A represents the operation vector, and X′ represents the second state data.
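A minimal sketch of building one training data group (X, A, X′), assuming three hypothetical operation classes:

```python
import numpy as np

OPS = ["power_on", "power_off", "stop_task"]  # K = 3 operation classes (assumed)

def one_hot_ops(num_nodes, ops_per_node):
    """Operation matrix A in R^{N x K}: entry (i, j) = 1 iff
    operation j is performed on node i."""
    A = np.zeros((num_nodes, len(OPS)))
    for node, op in ops_per_node:
        A[node, OPS.index(op)] = 1.0
    return A

A = one_hot_ops(3, [(0, "power_on"), (2, "stop_task")])
X = np.zeros((3, 4))        # first state data (placeholder values)
X_prime = np.ones((3, 4))   # second state data after the operations
sample = (X, A, X_prime)    # one training data group (X, A, X')
print(A.shape)  # (3, 3)
```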
S14, an initial model is constructed based on a Variational Auto-Encoder (VAE).
Specifically, the constructing of the initial model based on the variational self-encoder includes:
acquiring output data of an encoder in the variational self-encoder, and acquiring a mean signal and a variance signal from the output data;
acquiring random noise;
fusing the mean signal and the variance signal with the random noise to obtain a noise code;
adding an operation coding layer in the variational self-encoder, and deploying the operation vector on the operation coding layer;
merging the noise codes and the operation vectors to obtain hidden vectors;
and inputting the implicit vector to a decoder in the variational self-encoder to obtain the initial model.
Specifically, in the initial model, an encoder and a decoder in the variational self-encoder may use a graph convolution neural network as a basic structure.
Specifically, in the encoder, taking a graph convolution neural network with a single hidden layer as an example, the main structure of the graph convolution neural network can be represented as follows:
an input layer: inputting first state data X of the computing cluster;
hidden layer:
H = σ(G X W_1), where G ∈ R^{N×N} denotes the topology matrix;
mean output layer:
Z_mean = G H W_2;
variance output layer:
Z_log_var = G H W_3;
wherein W_1, W_2, W_3 are weight coefficients, and σ(·) is an activation function.
Further, superimposing the noise signal and the operation signal:
Gaussian noise ε ~ N(0, 1) is sampled from a Gaussian distribution as the random noise, and is fused with the mean signal Z_mean and the variance signal Z_log_var output by the encoder to obtain the noise code:
Z_noise = Z_mean + exp(Z_log_var / 2) ⊙ ε.
combining the noise code with the operation vector A ∈ RN×KObtaining a hidden vector Z representing the system state-the encoding of the system operation:
Z = [Z_noise | A], Z ∈ R^{N×(D′+K)},

where D′ is the dimension of the noise code.
further, in the decoder, taking a graph convolution neural network with a single hidden layer as an example, the main structure of the graph convolution neural network can be represented as follows:
an input layer: a hidden vector Z;
hidden layer:
H′ = σ(G Z W_4), where G denotes the topology matrix;
an output layer:
X′_output = G H′ W_5;
wherein W_4, W_5 are weight coefficients.
In particular, W_1, W_2, W_3, W_4, W_5 may be randomly initialized.
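As an illustrative sketch (not the patented implementation), the encoder, the reparameterization step and the decoder described above can be expressed with NumPy; the dimensions, the tanh activation, and the identity stand-in for the topology matrix G are all assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, H_DIM, Z_DIM, K = 4, 3, 8, 2, 3   # assumed dimensions

# Randomly initialized weight coefficients W1..W5, as in the text.
W1 = rng.normal(size=(D, H_DIM))
W2 = rng.normal(size=(H_DIM, Z_DIM))
W3 = rng.normal(size=(H_DIM, Z_DIM))
W4 = rng.normal(size=(Z_DIM + K, H_DIM))
W5 = rng.normal(size=(H_DIM, D))

sigma = np.tanh  # activation function sigma(.) (assumed choice)

def encode(G, X):
    H = sigma(G @ X @ W1)              # hidden layer
    return G @ H @ W2, G @ H @ W3      # mean and log-variance outputs

def reparameterize(Z_mean, Z_log_var, rng):
    eps = rng.normal(size=Z_mean.shape)            # Gaussian noise
    return Z_mean + np.exp(Z_log_var / 2.0) * eps  # noise code

def decode(G, Z_noise, A):
    Z = np.concatenate([Z_noise, A], axis=1)  # hidden vector [Z_noise | A]
    Hp = sigma(G @ Z @ W4)
    return G @ Hp @ W5                        # reconstructed second state

G = np.eye(N)                  # topology matrix (identity as a stand-in)
X = rng.normal(size=(N, D))
A = np.zeros((N, K)); A[0, 1] = 1.0
Z_mean, Z_log_var = encode(G, X)
X_out = decode(G, reparameterize(Z_mean, Z_log_var, rng), A)
print(X_out.shape)  # (4, 3)
```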
In the above embodiment, an operation coding layer is added on the basis of the original variational self-encoder; by changing the structure of the variational self-encoder, the constructed initial model can incorporate the actual operations.
And S15, training the initial model by using the training data to obtain a data generation model.
In at least one embodiment of the present invention, the training the initial model by using the training data to obtain a data generation model includes:
constructing a target loss function;
performing gradient descent training on the initial model by using the target loss function and based on the training data;
and when the value of the target loss function is smaller than or equal to a preset threshold value, stopping training, and determining the currently obtained model as the data generation model.
The preset threshold value can be configured in a user-defined mode.
Specifically, the constructing the target loss function includes:
the output loss component is calculated using the following equation:
L_output = SUM(square(X′ − X′_output))
wherein L_output represents the output loss component, X′ represents second state data obtained from the training data, and X′_output represents the second state data output by the model during training;
the hidden layer loss component is calculated using the following formula:
L_latent = −0.5 × SUM(I + Z_log_var − square(Z_mean) − exp(Z_log_var))
wherein L_latent represents the hidden layer loss component, I represents a constant matrix, Z_log_var represents the output data of the variance output layer of the model, and Z_mean represents the output data of the mean output layer of the model;
and calculating the sum of the output loss component and the hidden layer loss component to obtain the target loss function.
For example: the objective loss function can be expressed as: l ═ Loutput+Llatent
wherein I is a constant matrix in which every element is 1, with the same dimensions as Z_log_var and Z_mean.
In the above formulas, SUM denotes summing all elements of a matrix; square(·) denotes squaring a matrix element by element, and exp(·) denotes computing the exponential function of a matrix element by element.
Further, when training proceeds until the loss function remains below the preset threshold, model training ends, and the optimized weight coefficients W_1, W_2, W_3, W_4, W_5 are obtained.
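The two loss components and their sum can be sketched as a minimal NumPy version; the variable names mirror the formulas above:

```python
import numpy as np

def vae_loss(X_prime, X_prime_output, Z_mean, Z_log_var):
    # Output loss component: element-wise squared reconstruction error.
    L_output = np.sum(np.square(X_prime - X_prime_output))
    # Hidden-layer (KL) loss component; I is an all-ones matrix with
    # the same dimensions as Z_mean and Z_log_var.
    I = np.ones_like(Z_mean)
    L_latent = -0.5 * np.sum(I + Z_log_var - np.square(Z_mean) - np.exp(Z_log_var))
    return L_output + L_latent

loss = vae_loss(np.ones((2, 2)), np.zeros((2, 2)),
                np.zeros((2, 3)), np.zeros((2, 3)))
print(loss)  # 4.0: reconstruction error is 4, and the KL term is 0 at the prior
```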
By the embodiment, the output loss and the hidden layer loss can be comprehensively considered when the model is trained, so that the accuracy of the trained model is higher.
And S16, acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data set according to the output of the data generation model.
In at least one embodiment of the present invention, data may be randomly acquired from the training data as the data to be processed, or data may be collected from a production line environment as the data to be processed.
Specifically, the inputting the data to be processed into the data generation model, and constructing a computation cluster data group according to the output of the data generation model includes:
after model training is complete, the generative model may be used to generate relevant data. The first state data may be randomly extracted from the collected training data set, or collected in real time from the real production line environment, and taken as the initial state of the simulation system (i.e. the initial model) and recorded as the matrix Xorigin
The initial state X_origin of the simulated system is input into the encoder, and forward propagation is performed layer by layer to obtain the system state code, namely the mean signal Z_mean and the variance signal Z_log_var:

the hidden layer outputs H = σ(G X_origin W_1), where G denotes the topology matrix;

the mean output layer outputs Z_mean = G H W_2;

the variance output layer outputs Z_log_var = G H W_3.
A Gaussian noise signal ε ~ N(0, 1) is sampled from the normal distribution to form the noisy system state signal

Z_noise = Z_mean + exp(Z_log_var / 2) ⊙ ε,

which is combined with the operation vector A_test to be tested to obtain the superposed signal Z = [Z_noise | A_test].
Inputting a system state-system operation signal Z to a decoder to obtain operated second state data X'generate. A set of simulation data is obtained at the same time and is recorded as (X)origin,Atest,X′generate)。
Figure BDA0003290686600000131
Figure BDA0003290686600000132
Repeating the above steps to generate a series of data sets denoted as { (X)origin,Atest,X′generate) And realizing data simulation and data generation.
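The repeated generation procedure above can be sketched as a loop. The `generate` helper below is a hypothetical stand-in for the trained encoder/decoder pass (it merely perturbs the input), so only the loop structure and the (X_origin, A_test, X'_generate) grouping are meaningful:

```python
import numpy as np

rng = np.random.default_rng(7)
N, D, K = 3, 2, 3  # nodes, state signal dimension, operation classes

def generate(X_origin, A_test):
    """Hypothetical stand-in for the trained model's encode/noise/decode
    pass; the real pass uses the optimized weights W1..W5."""
    noise = rng.normal(scale=0.05, size=X_origin.shape)
    return np.clip(X_origin + noise, 0.0, 1.0)

datasets = []
for _ in range(5):
    X_origin = rng.random((N, D))              # sampled or collected initial state
    A_test = np.eye(K)[rng.integers(0, K, N)]  # operation vector to be tested
    X_generate = generate(X_origin, A_test)    # operated second state data
    datasets.append((X_origin, A_test, X_generate))
```

Each pass through the loop yields one simulation triple; accumulating them produces the series of data sets described above.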
Through the above embodiment, simulation of data can be realized through the model, and data can then be generated automatically by combining artificial intelligence means, which is efficient and accurate; while a large amount of data is generated, adverse effects on the normal work of the production line are effectively avoided.
In at least one embodiment of the invention, before the inputting the data to be processed into the data generation model, the method further comprises:
identifying a current task scene;
when the current task scene needs to generate abnormal data, acquiring the devices to be processed which are backups of each other from the devices to be processed;
and configuring the initial state of the devices to be processed which are backups of each other as an exception.
For example, when device A, device B, and device C are backups of each other, abnormal data is generated only when all three are abnormal. Therefore, if the initial states of device A, device B, and device C are all configured to be abnormal (e.g., all are in a shutdown state), the simulation data generated by the data generation model is abnormal data.
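A sketch of this configuration step with hypothetical device names, representing "abnormal" as an all-zero (shutdown) state signal per the example above:

```python
import numpy as np

# Initial state signals per device: [cpu_util, mem_util, powered_on].
devices = ["A", "B", "C", "D"]
X = np.array([[0.2, 0.5, 1.0],
              [0.3, 0.4, 1.0],
              [0.1, 0.6, 1.0],
              [0.7, 0.8, 1.0]])

backup_group = {"A", "B", "C"}  # devices that are backups of each other

# Abnormal data arises only when ALL mutual backups are abnormal,
# so configure every device in the group to the shutdown state.
for i, name in enumerate(devices):
    if name in backup_group:
        X[i] = [0.0, 0.0, 0.0]
```

Device D, which is not part of the backup group, keeps its original initial state.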
In the above embodiment, sufficient abnormal data is generated through the model for subsequent fault detection, fault recovery, and related system research and development, without bringing any harm or risk to the real production line environment. Relatively abundant and comprehensive abnormal data can be obtained, which improves the fault recovery efficiency of the computing cluster, shortens the fault locating and recovery time, and improves system availability.
It should be noted that, in order to further improve the security of the data and avoid malicious tampering of the data, the data generation model may be stored in the blockchain node.
It can be seen from the above technical solutions that the present invention can obtain devices to be processed and construct a computing cluster according to the devices to be processed; generate a topology structure of the computing cluster, where the generated topology structure can clearly reflect the relationship between the devices to be processed in the computing cluster; collect from the computing cluster the initial state of each device to be processed, each operation data executed on the computing cluster, and the final state of each device to be processed after each operation data is executed; generate training data according to the topology structure, the initial state of each device to be processed, each operation data, and the final state of each device to be processed; construct an initial model based on a variational self-encoder; train the initial model using the training data to obtain a data generation model; and collect data to be processed, input the data to be processed to the data generation model, and construct a computing cluster data group according to the output of the data generation model. Simulation of data can thus be realized through the model, and data can be generated automatically by combining artificial intelligence means, which is efficient and accurate; while a large amount of data is generated, adverse effects on the normal work of the production line are effectively avoided.
Fig. 2 is a functional block diagram of a data generating apparatus of a computing cluster according to a preferred embodiment of the present invention. The data generating device 11 of the computing cluster includes a building unit 110, a generating unit 111, an acquiring unit 112, and a training unit 113. The module/unit referred to in the present invention refers to a series of computer program segments that can be executed by the processor 13 and that can perform a fixed function, and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
The construction unit 110 obtains the device to be processed, and constructs a computing cluster according to the device to be processed.
In at least one embodiment of the present invention, the device to be processed may include a plurality of virtual machines or a plurality of servers, which is not limited by the present invention.
In at least one embodiment of the present invention, the devices to be processed are grouped into a group, so as to obtain the computing cluster.
The generating unit 111 generates a topology of the computing cluster.
It is understood that there is a certain calling relationship between the devices to be processed, and thus, the computing cluster can be represented in the form of graph data.
In at least one embodiment of the present invention, the generating unit 111 generates the topology of the computing cluster, including:
acquiring a calling relation between the devices to be processed;
determining each device to be processed as a node;
constructing directed edges among the nodes according to the calling relationship among the devices to be processed;
and numbering each node according to a configuration sequence to obtain the topological structure.
In this embodiment, the configuration sequence may be a work serial number of the to-be-processed device, or may be configured by a user, which is not limited in the present invention.
For example, the generated topology may be in matrix form, recorded as Â ∈ R^{N×N}, where the element in the i-th row and j-th column indicates whether there is a directed edge from the i-th node to the j-th node, and N is the total number of nodes.
Through the embodiment, the generated topological structure can clearly reflect the relationship among the devices to be processed in the computing cluster.
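As an illustrative sketch of how such a topology matrix might be built from the call relationships (the device names and call pairs are hypothetical examples, not from the patent):

```python
import numpy as np

# Hypothetical call relationships among devices to be processed:
# (caller, callee) pairs.
calls = [("gateway", "auth"), ("gateway", "compute"), ("compute", "storage")]

# Number each node according to a configuration sequence (here: sorted names).
nodes = sorted({d for pair in calls for d in pair})
index = {name: i for i, name in enumerate(nodes)}
N = len(nodes)

# Topology matrix T ∈ R^{N×N}: T[i, j] = 1 if node i calls node j.
T = np.zeros((N, N))
for caller, callee in calls:
    T[index[caller], index[callee]] = 1.0
```

Each directed edge corresponds to exactly one nonzero entry, so the matrix clearly records the calling relationships among the nodes.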
The collecting unit 112 collects an initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster, and a final state of each device to be processed after each operation data is executed.
In at least one embodiment of the present invention, the initial state refers to the state of the corresponding device to be processed before a specified operation is performed, such as: the utilization rate of the Central Processing Unit (CPU) of the corresponding device to be processed, the utilization rate of the memory, whether the device is in a power-on state, and the like.
In at least one embodiment of the present invention, each operation data executed on the computing cluster refers to an operation on the computing cluster, such as: starting up, shutting down, stopping the calculation task and the like.
In at least one embodiment of the present invention, the final state refers to the state of the corresponding device to be processed after the specified operation is performed, such as: after the power-on operation is executed, indexes such as the utilization rate of the CPU, the utilization rate of the memory, and whether the device to be processed is in the power-on state.
The generating unit 111 generates training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed.
In at least one embodiment of the present invention, the generating unit 111 generates training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed, including:
coding each operation data to obtain an operation vector;
for each node in the topological structure, vectorizing each node according to the initial state of each device to be processed to generate an initial state signal of each node;
constructing a matrix based on the topological structure and the initial state signal of each node to obtain a set of first state data of the computing cluster; wherein each first state data corresponds to each operation vector in the set of first state data;
vectorizing each node according to the final state of each device to be processed to generate a final state signal of each node;
constructing a matrix based on the topological structure and the final state signal of each node to obtain a set of second state data of the computing cluster; wherein each second state data corresponds to each operation vector in the set of second state data;
dividing the first state data and the second state data that correspond to the same operation vector, together with that operation vector, into a group to obtain at least one data group;
and integrating the at least one data set to obtain the training data.
For example, when vectorization processing is performed on each node according to the initial state of each device to be processed: when the CPU utilization of node A is determined to be 20%, it may be recorded as 0.2; when the memory usage rate of node A is 50%, it may be recorded as 0.5; when node A is in the power-on state, it may be marked as 1; and when node A is in the power-off state, it may be marked as 0. The numerical values corresponding to these states are then spliced transversely to obtain the initial state signal of node A.
Further, when the number of nodes and the topology of the computing cluster are determined, the first state data of the whole cluster may be represented as the set of initial state signals on all nodes, recorded as a matrix X ∈ R^{N×D}, where D is the signal dimension. For example, when there are 100 nodes and each node corresponds to 10 initial states, the matrix corresponding to the first state data has dimensions 100 × 10. In the matrix, the i-th row represents the initial state signal of the i-th node.
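A minimal sketch of the vectorization just described, reusing the node-A figures from the example (CPU 20%, memory 50%, powered on); nodes B and C are hypothetical:

```python
import numpy as np

def state_signal(cpu_util, mem_util, powered_on):
    """Splice the per-node state values transversely into one signal."""
    return np.array([cpu_util, mem_util, 1.0 if powered_on else 0.0])

# First state data X ∈ R^{N×D}: one initial-state signal per node
# (here N = 3 nodes, D = 3 state indexes).
X = np.stack([
    state_signal(0.20, 0.50, True),   # node A, per the example above
    state_signal(0.75, 0.30, True),   # node B (hypothetical)
    state_signal(0.00, 0.00, False),  # node C (hypothetical, powered off)
])
```

Row i of X is then the initial state signal of the i-th node, as in the matrix form above.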
In this embodiment, each operation data may be encoded by using a one-hot encoding algorithm, so as to obtain the operation vector.
For example: the code for power-on is denoted as 100, the code for power-off is denoted as 010, and the code for stopping the computation task is denoted as 001.
Further, the operations performed on the entire cluster may be represented as the set of operations on all nodes, denoted as the matrix A ∈ R^{N×K}, where K is the total number of operation classes. In the matrix, the element in the i-th row and j-th column indicates whether the j-th operation is performed on the i-th node.
In this embodiment, after the corresponding operations are performed on the computing cluster, the second state data of the computing cluster can be obtained, recorded as a matrix X' ∈ R^{N×D}.
Further, the first state data and the second state data corresponding to the same operation vector, together with that operation vector, are divided into a group. The obtained group of data may be referred to as (X, A, X'), where X represents the first state data, A represents the operation vector, and X' represents the second state data.
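A sketch of the one-hot operation encoding and the grouping into (X, A, X') triples; the operation names and the assignment of codes to columns are illustrative assumptions:

```python
import numpy as np

OPS = ["power_on", "power_off", "stop_task"]  # K = 3 operation classes

def one_hot(op):
    """One-hot encode an operation: exactly one column set to 1."""
    vec = np.zeros(len(OPS))
    vec[OPS.index(op)] = 1.0
    return vec

N, D = 3, 2
X = np.random.rand(N, D)            # first state data   X  ∈ R^{N×D}
A = np.stack([one_hot("power_on"),  # operation matrix   A  ∈ R^{N×K}
              one_hot("stop_task"),
              one_hot("power_off")])
X_prime = np.random.rand(N, D)      # second state data  X' ∈ R^{N×D}

# One training sample groups the two states with the operation linking them.
sample = (X, A, X_prime)
```

Row i of A records which operation was performed on node i; collecting many such triples yields the training data.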
The construction unit 110 constructs an initial model based on a Variational Auto-Encoder (VAE).
Specifically, the constructing unit 110 constructs the initial model based on the variational self-encoder, including:
acquiring output data of an encoder in the variational self-encoder, and acquiring a mean signal and a variance signal from the output data;
acquiring random noise;
fusing the mean signal and the variance signal with the random noise to obtain a noise code;
adding an operation coding layer in the variational self-encoder, and deploying the operation vector on the operation coding layer;
merging the noise codes and the operation vectors to obtain hidden vectors;
and inputting the implicit vector to a decoder in the variational self-encoder to obtain the initial model.
Specifically, in the initial model, an encoder and a decoder in the variational self-encoder may use a graph convolution neural network as a basic structure.
Specifically, in the encoder, taking a graph convolution neural network with a single hidden layer as an example, the main structure can be represented as follows:

input layer: the first state data X of the computing cluster is input;

hidden layer: H = σ(Â·X·W1);

mean output layer: Z_mean = Â·H·W2;

variance output layer: Z_log_var = Â·H·W3;

wherein Â is the (normalized) topology matrix of the computing cluster, W1, W2, W3 are weight coefficients, and σ(·) is an activation function.
Further, the noise signal and the operation signal are superimposed. Gaussian noise ε ~ N(0, 1) is obtained by sampling from the Gaussian distribution and used as the random noise, and is fused with the mean signal Z_mean and the variance signal Z_log_var output by the encoder to obtain the noise code:

Z_noise = Z_mean + exp(Z_log_var/2)·ε.

The noise code is combined with the operation vector A ∈ R^{N×K} to obtain the hidden vector Z, representing the system state-system operation code, i.e. the noise code and the operation vectors are spliced column-wise:

Z = [Z_noise | A].
Further, in the decoder, taking a graph convolution neural network with a single hidden layer as an example, the main structure can be represented as follows:

input layer: the hidden vector Z is input;

hidden layer: H' = σ(Â·Z·W4);

output layer: X'_output = σ(Â·H'·W5);

wherein W4, W5 are weight coefficients.
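A minimal numpy sketch of the encoder/decoder structure just described. The normalization of the topology matrix Â (self-loops plus row normalization), the sigmoid activation, and all dimensions and weights are illustrative assumptions, not specified by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, F, K = 4, 3, 2, 3          # nodes, state dim, code dim, operation classes

def sigma(x):                     # activation function σ(·), assumed sigmoid
    return 1.0 / (1.0 + np.exp(-x))

# Â: topology with self-loops, row-normalized (an assumed normalization).
T = rng.integers(0, 2, (N, N)).astype(float)
A_hat = T + np.eye(N)
A_hat /= A_hat.sum(axis=1, keepdims=True)

# Randomly initialized weight coefficients W1..W5.
W1, W2, W3 = rng.normal(size=(D, F)), rng.normal(size=(F, F)), rng.normal(size=(F, F))
W4, W5 = rng.normal(size=(F + K, F)), rng.normal(size=(F, D))

X = rng.random((N, D))                    # first state data
A_ops = np.eye(K)[rng.integers(0, K, N)]  # one-hot operation per node

# Encoder: hidden layer, mean output layer, variance output layer.
H = sigma(A_hat @ X @ W1)
Z_mean = A_hat @ H @ W2
Z_log_var = A_hat @ H @ W3

# Fuse mean/variance with Gaussian noise ε ~ N(0, 1) to get the noise code.
eps = rng.normal(size=Z_mean.shape)
Z_noise = Z_mean + np.exp(Z_log_var / 2.0) * eps

# Operation coding layer: merge the noise code with the operation vectors.
Z = np.concatenate([Z_noise, A_ops], axis=1)

# Decoder: hidden layer and output layer.
H_dec = sigma(A_hat @ Z @ W4)
X_output = sigma(A_hat @ H_dec @ W5)
```

The concatenation step is the added operation coding layer: the decoder sees both the encoded system state and the operation applied to it.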
In particular, W1, W2, W3, W4, W5 may be randomly initialized.
In the above embodiment, an operation coding layer is added on the basis of the original variational self-encoder, and the constructed initial model can be fused with the actual operation by changing the structure of the variational self-encoder.
The training unit 113 trains the initial model using the training data to obtain a data generation model.
In at least one embodiment of the present invention, the training unit 113 trains the initial model by using the training data, and obtaining a data generation model includes:
constructing a target loss function;
performing gradient descent training on the initial model by using the target loss function and based on the training data;
and when the value of the target loss function is smaller than or equal to a preset threshold value, stopping training, and determining the currently obtained model as the data generation model.
The preset threshold value can be configured in a user-defined mode.
Specifically, the constructing the target loss function includes:
the output loss component is calculated using the following formula:

L_output = SUM(square(X' - X'_output)),

wherein L_output represents the output loss component, X' represents second state data obtained from the training data, and X'_output represents second state data output by the model during the training process;

the hidden layer loss component is calculated using the following formula:

L_latent = -(1/2)·SUM(I + Z_log_var - square(Z_mean) - exp(Z_log_var)),

wherein L_latent represents the hidden layer loss component, I represents a constant matrix, Z_log_var represents the output data of the variance output layer of the model, and Z_mean represents the output data of the mean output layer of the model;

and the sum of the output loss component and the hidden layer loss component is calculated to obtain the target loss function.

For example, the target loss function can be expressed as: L = L_output + L_latent, where I is a constant matrix in which every element is 1 and whose dimensions are the same as those of Z_log_var and Z_mean.
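Under the definitions above, the target loss function combining the output loss and hidden layer loss components can be sketched as (matrix sizes are illustrative):

```python
import numpy as np

def target_loss(X_prime, X_output, Z_mean, Z_log_var):
    """L = L_output + L_latent, as defined above."""
    # Output loss: sum of element-wise squared reconstruction error.
    L_output = np.sum(np.square(X_prime - X_output))
    # Hidden layer loss: I is an all-ones matrix with the same
    # dimensions as Z_mean and Z_log_var.
    I = np.ones_like(Z_mean)
    L_latent = -0.5 * np.sum(I + Z_log_var - np.square(Z_mean) - np.exp(Z_log_var))
    return L_output + L_latent

loss = target_loss(np.ones((2, 2)), np.zeros((2, 2)),
                   np.zeros((2, 2)), np.zeros((2, 2)))
```

With zero mean and zero log-variance the hidden layer component vanishes, so the toy call above reduces to the squared reconstruction error alone.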
In the above formulas, SUM(·) denotes summing all elements of a matrix, square(·) denotes element-by-element squaring of a matrix, and exp(·) denotes element-by-element computation of the exponential function of a matrix.
Further, when training proceeds until the loss function remains below the preset threshold, the model training is finished, and the optimized weight coefficients W1, W2, W3, W4 and W5 of the model are obtained.
By the embodiment, the output loss and the hidden layer loss can be comprehensively considered when the model is trained, so that the accuracy of the trained model is higher.
The construction unit 110 collects data to be processed, inputs the data to be processed to the data generation model, and constructs a calculation cluster data set according to the output of the data generation model.
In at least one embodiment of the present invention, data may be randomly acquired from the training data as the data to be processed, or data may be collected from a production line environment as the data to be processed.
Specifically, the constructing unit 110 inputs the data to be processed into the data generation model, and constructs a calculation cluster data group according to the output of the data generation model, including:
After model training is complete, the data generation model may be used to generate relevant data. The first state data may be randomly extracted from the collected training data set, or collected in real time from the real production line environment, and is taken as the initial state of the simulation system (i.e. the initial model), recorded as the matrix X_origin.

The initial state X_origin of the simulation system is input into the encoder and propagated forward layer by layer to obtain the system state code, namely the mean signal Z_mean and the variance signal Z_log_var, wherein:

the hidden layer outputs H = σ(Â·X_origin·W1), where Â denotes the (normalized) topology matrix of the computing cluster and σ(·) is the activation function;

the mean output layer outputs Z_mean = Â·H·W2;

the variance output layer outputs Z_log_var = Â·H·W3.

A Gaussian noise signal ε ~ N(0, 1) is sampled from the normal distribution to form the noisy system state signal Z_noise = Z_mean + exp(Z_log_var/2)·ε, which is combined with the operation vector A_test to be tested to obtain the superposed signal Z = [Z_noise | A_test].

The system state-system operation signal Z is input to the decoder to obtain the operated second state data X'_generate, wherein the hidden layer outputs H' = σ(Â·Z·W4) and the output layer outputs X'_generate = σ(Â·H'·W5). A set of simulation data is thereby obtained, recorded as (X_origin, A_test, X'_generate).

Repeating the above steps generates a series of data sets, denoted as {(X_origin, A_test, X'_generate)}, realizing data simulation and data generation.
Through the above embodiment, simulation of data can be realized through the model, and data can then be generated automatically by combining artificial intelligence means, which is efficient and accurate; while a large amount of data is generated, adverse effects on the normal work of the production line are effectively avoided.
In at least one embodiment of the present invention, before the data to be processed is input into the data generation model, a current task scene is identified;
when the current task scene needs to generate abnormal data, acquiring the devices to be processed which are backups of each other from the devices to be processed;
and configuring the initial state of the devices to be processed which are backups of each other as an exception.
For example, when device A, device B, and device C are backups of each other, abnormal data is generated only when all three are abnormal. Therefore, if the initial states of device A, device B, and device C are all configured to be abnormal (e.g., all are in a shutdown state), the simulation data generated by the data generation model is abnormal data.
In the above embodiment, sufficient abnormal data is generated through the model for subsequent fault detection, fault recovery, and related system research and development, without bringing any harm or risk to the real production line environment. Relatively abundant and comprehensive abnormal data can be obtained, which improves the fault recovery efficiency of the computing cluster, shortens the fault locating and recovery time, and improves system availability.
It should be noted that, in order to further improve the security of the data and avoid malicious tampering of the data, the data generation model may be stored in the blockchain node.
It can be seen from the above technical solutions that the present invention can obtain devices to be processed and construct a computing cluster according to the devices to be processed; generate a topology structure of the computing cluster, where the generated topology structure can clearly reflect the relationship between the devices to be processed in the computing cluster; collect from the computing cluster the initial state of each device to be processed, each operation data executed on the computing cluster, and the final state of each device to be processed after each operation data is executed; generate training data according to the topology structure, the initial state of each device to be processed, each operation data, and the final state of each device to be processed; construct an initial model based on a variational self-encoder; train the initial model using the training data to obtain a data generation model; and collect data to be processed, input the data to be processed to the data generation model, and construct a computing cluster data group according to the output of the data generation model. Simulation of data can thus be realized through the model, and data can be generated automatically by combining artificial intelligence means, which is efficient and accurate; while a large amount of data is generated, adverse effects on the normal work of the production line are effectively avoided.
Fig. 3 is a schematic structural diagram of a computer device according to a preferred embodiment of the present invention, which implements a data generation method for a computing cluster.
The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as a data generating program of a computing cluster, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the computer device 1, and does not constitute a limitation to the computer device 1, the computer device 1 may have a bus-type structure or a star-shaped structure, the computer device 1 may further include more or less other hardware or software than those shown, or different component arrangements, for example, the computer device 1 may further include an input and output device, a network access device, etc.
It should be noted that the computer device 1 is only an example, and other electronic products that are currently available or may come into existence in the future, such as electronic products that can be adapted to the present invention, should also be included in the scope of the present invention, and are included herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, for example a removable hard disk of the computer device 1. The memory 12 may also be an external storage device of the computer device 1 in other embodiments, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the computer device 1. The memory 12 may be used not only for storing application software installed in the computer device 1 and various types of data, such as codes of data generation programs of a computing cluster, etc., but also for temporarily storing data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the computer device 1, connects various components of the entire computer device 1 by using various interfaces and lines, and executes various functions and processes data of the computer device 1 by running or executing programs or modules (for example, executing a data generation program of a computing cluster, etc.) stored in the memory 12 and calling data stored in the memory 12.
The processor 13 executes the operating system of the computer device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in the above-mentioned data generation method embodiments of each computing cluster, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the computer device 1. For example, the computer program may be segmented into a construction unit 110, a generation unit 111, an acquisition unit 112, a training unit 113.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the data generation method of the computing cluster according to the embodiments of the present invention.
The integrated modules/units of the computer device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), random-access Memory, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one line is shown in FIG. 3, but this does not mean only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the computer device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The computer device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the computer device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the computer device 1 and other computer devices.
Optionally, the computer device 1 may further comprise a user interface, which may be a Display (Display), an input unit, such as a Keyboard (Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the computer device 1 and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 shows only the computer device 1 with the components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the computer device 1 and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
With reference to fig. 1, the memory 12 in the computer device 1 stores a plurality of instructions to implement a data generation method of a computing cluster, and the processor 13 can execute the plurality of instructions to implement:
acquiring equipment to be processed, and constructing a computing cluster according to the equipment to be processed;
generating a topology of the computing cluster;
acquiring an initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster, and a final state of each device to be processed after each operation data is executed;
generating training data according to the topological structure, the initial state of each device to be processed, each operation data and the final state of each device to be processed;
constructing an initial model based on a variational self-encoder;
training the initial model by using the training data to obtain a data generation model;
and acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data group according to the output of the data generation model.
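The instruction flow above can be sketched end to end in Python. All function names, the identity-mapping stand-in for the trained model, and the sample device states are illustrative assumptions, not the patent's implementation:

```python
def run_pipeline(devices, call_relations, initial_states, operations, final_states):
    """Minimal stand-in for the seven instructions: build a topology,
    assemble training data, 'train' a model, and produce a data group."""
    # Steps 1-2: construct the cluster topology (node numbering + call edges).
    node_id = {dev: i for i, dev in enumerate(devices)}
    edges = [(node_id[u], node_id[v]) for u, v in call_relations]
    # Steps 3-4: one training sample per executed operation:
    # (initial state, operation data, final state).
    training_data = list(zip(initial_states, operations, final_states))
    # Steps 5-6: a trained generation model, stood in here by an identity mapping.
    model = lambda sample: sample
    # Step 7: feed to-be-processed data through the model to build the data group.
    data_group = [model(sample) for sample in training_data]
    return edges, training_data, data_group

edges, training_data, data_group = run_pipeline(
    devices=["gateway", "worker"],
    call_relations=[("gateway", "worker")],
    initial_states=[{"gateway": "ok", "worker": "ok"}],
    operations=["restart-worker"],
    final_states=[{"gateway": "ok", "worker": "restarting"}],
)
```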
Specifically, for a specific implementation of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; the division of the modules is only one logical functional division, and other division manners may be adopted in actual implementations.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of those technical solutions.

Claims (10)

1. A data generation method of a computing cluster, the data generation method of the computing cluster comprising:
acquiring equipment to be processed, and constructing a computing cluster according to the equipment to be processed;
generating a topology of the computing cluster;
acquiring an initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster, and a final state of each device to be processed after each operation data is executed;
generating training data according to the topological structure, the initial state of each device to be processed, each operation data and the final state of each device to be processed;
constructing an initial model based on a variational self-encoder;
training the initial model by using the training data to obtain a data generation model;
and acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data group according to the output of the data generation model.
2. The data generation method of a computing cluster of claim 1, wherein the generating the topology of the computing cluster comprises:
acquiring a calling relation between the devices to be processed;
determining each device to be processed as a node;
constructing directed edges among the nodes according to the calling relationship among the devices to be processed;
and numbering each node according to a configuration sequence to obtain the topological structure.
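The topology construction of claim 2 can be sketched as follows; the device names and call relations are illustrative assumptions:

```python
def build_topology(devices, call_relations):
    """Number each device as a node (in the configured order) and add a
    directed edge u -> v for every call relation in which u calls v."""
    node_id = {dev: i for i, dev in enumerate(devices)}
    edges = [(node_id[u], node_id[v]) for (u, v) in call_relations]
    return node_id, edges

devices = ["gateway", "worker-a", "worker-b"]
calls = [("gateway", "worker-a"), ("gateway", "worker-b"), ("worker-a", "worker-b")]
node_id, edges = build_topology(devices, calls)
print(node_id)  # {'gateway': 0, 'worker-a': 1, 'worker-b': 2}
print(edges)    # [(0, 1), (0, 2), (1, 2)]
```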
3. The data generation method of a computing cluster of claim 1, wherein said generating training data according to the topological structure, the initial state of each device to be processed, each operation data and the final state of each device to be processed comprises:
coding each operation data to obtain an operation vector;
for each node in the topological structure, vectorizing each node according to the initial state of each device to be processed to generate an initial state signal of each node;
constructing a matrix based on the topological structure and the initial state signal of each node to obtain a set of first state data of the computing cluster; wherein each first state data in the set corresponds to one of the operation vectors;
vectorizing each node according to the final state of each device to be processed to generate a final state signal of each node;
constructing a matrix based on the topological structure and the final state signal of each node to obtain a set of second state data of the computing cluster; wherein each second state data in the set corresponds to one of the operation vectors;
grouping the first state data and the second state data that correspond to the same operation vector, together with that operation vector, to obtain at least one data group;
and integrating the at least one data set to obtain the training data.
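The assembly of claim 3 can be sketched as pairing before/after state matrices with their operation vector. How the matrix is "constructed based on the topological structure and the state signal" is not fully specified; appending each node's state signal to its adjacency row is an illustrative assumption:

```python
def build_state_matrix(adjacency, node_signals):
    """One row per node: its adjacency row followed by its state signal
    (an assumed way of combining topology and state into one matrix)."""
    return [row + [signal] for row, signal in zip(adjacency, node_signals)]

def build_training_data(adjacency, initial_signals, final_signals, op_vectors):
    """Group the first state data, operation vector, and second state data
    that correspond to the same executed operation."""
    groups = []
    for op, init, final in zip(op_vectors, initial_signals, final_signals):
        first = build_state_matrix(adjacency, init)    # before the operation
        second = build_state_matrix(adjacency, final)  # after the operation
        groups.append((first, op, second))
    return groups

# Two nodes, one directed edge 0 -> 1, two executed operations.
adjacency = [[0, 1], [0, 0]]
groups = build_training_data(
    adjacency,
    initial_signals=[[1.0, 1.0], [1.0, 0.0]],
    final_signals=[[1.0, 0.0], [0.0, 0.0]],
    op_vectors=[[0, 1], [1, 0]],
)
```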
4. The data generation method of a computing cluster of claim 3, wherein said constructing an initial model based on a variational self-encoder comprises:
acquiring output data of an encoder in the variational self-encoder, and acquiring a mean signal and a variance signal from the output data;
acquiring random noise;
fusing the mean signal and the variance signal with the random noise to obtain a noise code;
adding an operation coding layer in the variational self-encoder, and deploying the operation vector on the operation coding layer;
merging the noise codes and the operation vectors to obtain hidden vectors;
and inputting the implicit vector to a decoder in the variational self-encoder to obtain the initial model.
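The latent-vector construction of claim 4 can be sketched as follows, assuming the "fusing" step follows the standard VAE reparameterization z = mean + exp(log_var / 2) * noise; the concrete numbers are illustrative:

```python
import math

def reparameterize(mean, log_var, noise):
    """Fuse the mean and variance signals with random noise:
    z = mean + exp(log_var / 2) * noise (standard VAE reparameterization)."""
    return [m + math.exp(lv / 2.0) * e for m, lv, e in zip(mean, log_var, noise)]

def hidden_vector(mean, log_var, noise, op_vector):
    """Merge the noise code with the operation vector; the result is the
    implicit vector that would be fed to the decoder."""
    return reparameterize(mean, log_var, noise) + list(op_vector)

# With log_var = 0, exp(log_var / 2) = 1, so the noise is added directly.
z = hidden_vector(mean=[0.5, -0.2], log_var=[0.0, 0.0],
                  noise=[0.1, -0.1], op_vector=[1.0, 0.0])
print(z)  # [0.6, -0.30000000000000004, 1.0, 0.0]
```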
5. The data generation method of a computing cluster of claim 1, wherein said training the initial model by using the training data to obtain a data generation model comprises:
constructing a target loss function;
performing gradient descent training on the initial model by using the target loss function and based on the training data;
and when the value of the target loss function is smaller than or equal to a preset threshold value, stopping training, and determining the currently obtained model as the data generation model.
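The stopping rule of claim 5 (train by gradient descent until the loss falls to a preset threshold) can be sketched with a toy one-parameter objective standing in for the target loss function; learning rate and threshold are illustrative:

```python
def train_until_threshold(loss_fn, grad_fn, w, lr=0.1, threshold=1e-4, max_steps=10_000):
    """Gradient descent that stops once the loss is at or below the threshold."""
    for _ in range(max_steps):
        if loss_fn(w) <= threshold:
            break
        w -= lr * grad_fn(w)
    return w

# Toy objective (w - 3)^2 standing in for the target loss function.
w = train_until_threshold(lambda w: (w - 3.0) ** 2,
                          lambda w: 2.0 * (w - 3.0),
                          w=0.0)
```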
6. The data generation method of a computing cluster of claim 5, wherein said constructing an objective loss function comprises:
the output loss component is calculated using the following equation:
L_output = || X' - X'_output ||^2

wherein L_output represents the output loss component, X' represents the second state data obtained from the training data, and X'_output represents the second state data output by the model during the training process;
the hidden layer loss component is calculated using the following formula:
L_latent = -(1/2) * Σ(I + Z_log_var - Z_mean^2 - e^(Z_log_var))

wherein L_latent represents the hidden layer loss component, I represents a constant matrix, Z_log_var represents the output data of the variance output layer of the model, and Z_mean represents the output data of the mean output layer of the model;
and calculating the sum of the output loss component and the hidden layer loss component to obtain the target loss function.
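The two components of claim 6 can be sketched as follows, assuming the output loss is the squared error between target and output, and the hidden layer loss is the standard VAE KL term (with the constant matrix I playing the role of the "1" in the usual formula):

```python
import math

def output_loss(x_true, x_out):
    """Squared error between the training second state data and model output."""
    return sum((a - b) ** 2 for a, b in zip(x_true, x_out))

def hidden_layer_loss(z_mean, z_log_var):
    """Standard VAE KL term: -1/2 * sum(1 + log_var - mean^2 - exp(log_var))."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))

def target_loss(x_true, x_out, z_mean, z_log_var):
    """Target loss = output loss component + hidden layer loss component."""
    return output_loss(x_true, x_out) + hidden_layer_loss(z_mean, z_log_var)

# A standard-normal latent (mean 0, log-variance 0) makes the KL term vanish.
loss = target_loss([1.0, 2.0], [1.0, 2.0], z_mean=[0.0, 0.0], z_log_var=[0.0, 0.0])
```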
7. The data generation method of a computing cluster of claim 1, wherein before said inputting the data to be processed into the data generation model, the method further comprises:
identifying a current task scene;
when the current task scene needs to generate abnormal data, acquiring the devices to be processed which are backups of each other from the devices to be processed;
and configuring the initial state of the devices to be processed which are backups of each other as an exception.
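The scene-dependent configuration of claim 7 can be sketched as below; the scene identifier, device names, and state strings are illustrative assumptions:

```python
def configure_initial_states(initial_states, backup_groups, task_scene):
    """When the task scene requires abnormal data, set every device in each
    mutual-backup group to the abnormal state; otherwise leave states as-is."""
    states = dict(initial_states)
    if task_scene == "generate_abnormal":
        for group in backup_groups:
            for device in group:
                states[device] = "abnormal"
    return states

states = configure_initial_states(
    initial_states={"db-1": "normal", "db-2": "normal", "gateway": "normal"},
    backup_groups=[("db-1", "db-2")],  # db-1 and db-2 back each other up
    task_scene="generate_abnormal",
)
print(states)  # {'db-1': 'abnormal', 'db-2': 'abnormal', 'gateway': 'normal'}
```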
8. A data generation apparatus of a computing cluster, the data generation apparatus of the computing cluster comprising:
the device comprises a construction unit, a processing unit and a processing unit, wherein the construction unit is used for acquiring a device to be processed and constructing a computing cluster according to the device to be processed;
a generating unit, configured to generate a topology structure of the computing cluster;
the acquisition unit is used for acquiring the initial state of each device to be processed from the computing cluster, each operation data executed on the computing cluster and the final state of each device to be processed after each operation data is executed;
the generating unit is further configured to generate training data according to the topology, the initial state of each device to be processed, each operation data, and the final state of each device to be processed;
the construction unit is also used for constructing an initial model based on the variational self-encoder;
the training unit is used for training the initial model by using the training data to obtain a data generation model;
the construction unit is further used for acquiring data to be processed, inputting the data to be processed into the data generation model, and constructing a computing cluster data group according to the output of the data generation model.
9. A computer device, characterized in that the computer device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the data generation method of a computing cluster according to any of claims 1 to 7.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein at least one instruction that is executable by a processor in a computer device to implement a data generation method of a computing cluster according to any one of claims 1 to 7.
CN202111162285.5A 2021-09-30 2021-09-30 Data generation method, device, equipment and medium of computing cluster Pending CN113919420A (en)

Publications (1)

Publication Number: CN113919420A; Publication Date: 2022-01-11



Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination