CN116361662B - Training method of machine learning model and performance prediction method of quantum network equipment - Google Patents


Info

Publication number
CN116361662B
CN116361662B (application CN202310626553.7A)
Authority
CN
China
Prior art keywords
performance
layer
attention
machine learning
learning model
Prior art date
Legal status
Active
Application number
CN202310626553.7A
Other languages
Chinese (zh)
Other versions
CN116361662A (en)
Inventor
王嘉诚
张少仲
张栩
Current Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Original Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhongcheng Hualong Computer Technology Co Ltd filed Critical Zhongcheng Hualong Computer Technology Co Ltd
Priority to CN202310626553.7A priority Critical patent/CN116361662B/en
Publication of CN116361662A publication Critical patent/CN116361662A/en
Application granted granted Critical
Publication of CN116361662B publication Critical patent/CN116361662B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N10/00Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • G06N10/70Quantum error correction, detection or prevention, e.g. surface codes or magic state distillation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0852Quantum cryptography
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention provides a training method for a machine learning model and a performance prediction method for quantum network devices, relating to the technical field of data processing. The method comprises the following steps: constructing a machine learning model comprising an input layer, an encoding layer, and a decoding layer, where the input layer is provided with a data processing unit and the encoding layer comprises a change relation attention sub-layer, a correlation relation attention sub-layer, and an evaluation relation attention sub-layer; forming a plurality of sample pairs based on the performance values of the performance parameters of the quantum network device at each historical moment, where each sample pair comprises the performance values of the performance parameters at a set number of consecutive historical moments, the performance values at the latest historical moment serving as the output sample and the performance values at the other historical moments serving as the input sample; and training the machine learning model using the plurality of sample pairs. This scheme enables accurate prediction of the performance values of quantum network devices.

Description

Training method of machine learning model and performance prediction method of quantum network equipment
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a training method of a machine learning model and a performance prediction method of quantum network equipment.
Background
With the rapid development of quantum communication technology, quantum network devices have emerged. The performance of quantum network devices determines the performance of the quantum communication system, so it is necessary to predict the future performance of quantum network devices and to make specific plans (such as disk expansion) for foreseeable performance conditions, thereby ensuring the performance stability of the quantum communication system.
At present, the historical and current operating data generated by quantum network devices are only displayed and used directly; the performance values of the performance indicators of quantum network devices are not predicted. The data utilization rate is therefore low, and the stability of the quantum communication system cannot be guaranteed.
Disclosure of Invention
The embodiment of the invention provides a training method for a machine learning model and a method and device for predicting the performance of quantum network devices, which can accurately predict the performance values of quantum network devices and thereby ensure the stability of the quantum communication system.
In a first aspect, an embodiment of the present invention provides a training method for a machine learning model, including:
constructing a machine learning model, the machine learning model comprising: an input layer, an encoding layer, and a decoding layer; the input layer is provided with a data processing unit, and the encoding layer comprises a change relation attention sub-layer, a correlation relation attention sub-layer, and an evaluation relation attention sub-layer;
forming a plurality of sample pairs based on the performance values of the performance parameters of the quantum network device at each historical moment; each sample pair comprises the performance values of the performance parameters at a set number of consecutive historical moments; the performance values at the latest historical moment in the sample pair are the output sample, and the performance values at the other historical moments are the input sample;
training the machine learning model with the plurality of sample pairs;
in the machine learning model, the data processing unit is used for processing an input sample, inputting a change relation matrix of each performance parameter obtained by processing in a time dimension into the change relation attention sub-layer, inputting a correlation relation matrix between the performance parameters obtained by processing into the correlation relation attention sub-layer, and inputting an evaluation relation matrix between the performance parameters and performance evaluation results into the evaluation relation attention sub-layer; the decoding layer is used for fusing the intermediate codes output by each attention sub-layer to obtain an attention coding sequence, and outputting the performance values of the performance parameters by using the attention coding sequence.
In a second aspect, an embodiment of the present invention further provides a method for predicting performance of a quantum network device, including:
acquiring performance values of performance parameters of the quantum network equipment at the current moment and each historical moment;
and inputting the performance values of the performance parameters at the current moment and each historical moment into a machine learning model trained as described in any of the above embodiments, and outputting the predicted performance values of the performance parameters of the quantum network device at the next moment.
In a third aspect, an embodiment of the present invention further provides a device for predicting performance of a quantum network device, including:
the first acquisition unit is used for acquiring performance values of performance parameters of the quantum network equipment at the current moment and each historical moment;
and the second acquisition unit is used for inputting the performance values of the performance parameters at the current moment and each historical moment into a machine learning model trained as described in any of the above embodiments, and outputting the predicted performance values of the performance parameters of the quantum network device at the next moment.
In a fourth aspect, an embodiment of the present invention further provides an electronic device, including a memory and a processor, where the memory stores a computer program, and when the processor executes the computer program, the method described in any embodiment of the present specification is implemented.
In a fifth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a computer program, which when executed in a computer, causes the computer to perform a method according to any of the embodiments of the present specification.
The embodiment of the invention provides a training method for a machine learning model and a performance prediction method for quantum network devices. A machine learning model is constructed, and a change relation attention sub-layer, a correlation relation attention sub-layer, and an evaluation relation attention sub-layer are added to its encoding layer. During training, the change relation attention sub-layer outputs an intermediate encoding based on the change relation matrix of each performance parameter in the time dimension, the correlation relation attention sub-layer outputs an intermediate encoding based on the correlation relation matrix between the performance parameters, and the evaluation relation attention sub-layer outputs an intermediate encoding based on the evaluation relation matrix between the performance parameters and the performance evaluation results. In this way, attention enhancement can be applied separately to the different relations present in the input sample; the intermediate encodings output by the attention sub-layers are then fused, and the fused attention encoding sequence is used to output the performance values of the performance parameters. Because the machine learning model is trained on features from multiple angles, the trained model can predict the performance values of the quantum network device more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a training method for a machine learning model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a machine learning model according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining intermediate codes of a change relationship attention sub-layer output according to an embodiment of the present invention;
FIG. 4 is a flow chart of a fusion method according to an embodiment of the present invention;
FIG. 5 is a hardware architecture diagram of an electronic device according to an embodiment of the present invention;
fig. 6 is a block diagram of a device for predicting performance of a quantum network device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by those skilled in the art without making any inventive effort based on the embodiments of the present invention are within the scope of protection of the present invention.
Referring to fig. 1 and 2, an embodiment of the present invention provides a training method of a machine learning model, including:
step 100, constructing a machine learning model, the machine learning model comprising: an input layer, an encoding layer, and a decoding layer; the input layer is provided with a data processing unit, and the coding layer comprises a change relation attention sub-layer, a correlation relation attention sub-layer and an evaluation relation attention sub-layer;
Step 102, forming a plurality of sample pairs based on the performance values of the performance parameters of the quantum network device at each historical moment; each sample pair comprises the performance values of the performance parameters at a set number of consecutive historical moments; the performance values at the latest historical moment in the sample pair are the output sample, and the performance values at the other historical moments are the input sample;
Step 104, training the machine learning model with the plurality of sample pairs;
in the machine learning model, the data processing unit processes an input sample and inputs the resulting change relation matrix of each performance parameter in the time dimension into the change relation attention sub-layer, the resulting correlation relation matrix between the performance parameters into the correlation relation attention sub-layer, and the resulting evaluation relation matrix between the performance parameters and the performance evaluation results into the evaluation relation attention sub-layer; the decoding layer fuses the intermediate encodings output by each attention sub-layer to obtain an attention encoding sequence and uses the attention encoding sequence to output the performance values of the performance parameters.
In the embodiment of the invention, a machine learning model is constructed and a change relation attention sub-layer, a correlation relation attention sub-layer, and an evaluation relation attention sub-layer are added to its encoding layer. During training, the change relation attention sub-layer outputs an intermediate encoding based on the change relation matrix of each performance parameter in the time dimension, the correlation relation attention sub-layer outputs an intermediate encoding based on the correlation relation matrix between the performance parameters, and the evaluation relation attention sub-layer outputs an intermediate encoding based on the evaluation relation matrix between the performance parameters and the performance evaluation results. Attention enhancement can thus be applied separately to the different relations present in the input sample; the intermediate encodings output by the attention sub-layers are then fused, and the fused attention encoding sequence is used to output the performance values of the performance parameters. Because the model is trained on features from multiple angles, the trained machine learning model can predict the performance values of the quantum network device more accurately.
The manner in which the individual steps shown in fig. 1 are performed is described below.
First, step 102 will be described.
The quantum network device may include at least one of a quantum key distribution management server, a quantum key distribution device, a quantum key management device, a quantum VPN, and an optical quantum switch. Different quantum network devices have different performance parameters. For example, for a quantum key distribution management server, the performance parameters are CPU utilization, memory utilization, and the number of configured quantum network devices; for a quantum key distribution device, the performance parameters are CPU utilization, memory utilization, bit rate, bit error rate, and the current total quantum key generation amount; for a quantum key management device, the performance parameters are CPU utilization, memory utilization, and quantum key relay.
In the embodiment of the invention, the performance values of the performance parameters at historical moments may be the performance values of each performance parameter of the quantum network device within a set historical time period, such as the past year or the past month. The historical moments may be equally spaced instants within that period, for example spaced one unit duration apart.
When forming the sample pairs, assume that the historical moments corresponding to the obtained historical performance values are, from earliest to latest, t-50, t-49, t-48, ..., t-2, t-1. One sample pair may cover t-50, t-49, t-48, ..., t-42, t-41: for this sample pair, the performance values of the quantum network device at the latest historical moment t-41 serve as the output sample, and the performance values at the other historical moments t-50, t-49, t-48, ..., t-42 serve as the input sample. As another example, another sample pair may cover t-49, t-48, t-47, ..., t-41, t-40: for this sample pair, the performance values at the latest historical moment t-40 serve as the output sample, and the performance values at the other historical moments t-49, t-48, t-47, ..., t-41 serve as the input sample.
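The sliding-window construction of sample pairs described above can be sketched as follows. This is a minimal illustration in Python; the function and variable names (`make_sample_pairs`, `history`, `window`) are illustrative, not from the patent.

```python
# Sketch: build (input sample, output sample) pairs by sliding a window
# over the historical per-moment performance-value vectors.

def make_sample_pairs(history, window):
    """history: list of per-moment performance-value vectors, earliest first.
    Each pair uses `window` consecutive moments as the input sample and
    the immediately following moment as the output sample."""
    pairs = []
    for end in range(window, len(history)):
        input_sample = history[end - window:end]   # e.g. t-50 ... t-42
        output_sample = history[end]               # e.g. t-41
        pairs.append((input_sample, output_sample))
    return pairs

# 50 historical moments, 3 performance parameters per moment (toy data)
history = [[float(t), float(t) * 0.5, float(t) * 2.0] for t in range(50)]
pairs = make_sample_pairs(history, window=9)
```

Each later sample pair shares all but one moment with its predecessor, which matches the two overlapping example pairs above.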
Steps 100 and 104 will now be described together.
In the embodiment of the invention, in order to improve the accuracy of the prediction result of the machine learning model, feature mining can be performed on the input sample from different angles, and specifically, the method comprises the following three angles:
angle one: and constructing a change relation attention sub-layer facing to the change relation in the time dimension.
The change relation attention sub-layer can capture the nonlinear law of change of each performance parameter in the time dimension, thereby obtaining the temporal dependencies of the performance parameters between earlier and later moments, and the output sample is predicted using these dependencies.
And angle II: and constructing a correlation attention sub-layer facing to the correlation between the performance parameters.
The correlation attention sub-layer can capture how the correlation between the performance parameters changes over time, and record the correlation between the performance parameters when certain faults occur during device operation, so that the output sample can be predicted using this correlation.
And angle III: and constructing an evaluation relationship attention sub-layer facing to the evaluation relationship between the performance parameters and the performance evaluation results.
The evaluation relation attention sub-layer can capture how the evaluation relation between the performance parameters and the performance evaluation results changes over time during device operation, and its learning can be continuously refined as operating time accumulates, so that the output sample can be predicted using this evaluation relation.
Compared with predicting performance values using only the change relation of the performance parameters in the time dimension (angle one), adding angle two and angle three allows more features of the data to be mined, which improves the accuracy of the performance prediction.
In the training process of the machine learning model, an input sample is input to an input layer, and the input sample is processed by a data processing unit arranged in the input layer, specifically, the data processing unit respectively performs the following data processing from the three angles:
for the first angle, the data processing unit takes the performance value of each performance parameter in the input sample at each historical moment as an element in a change relation matrix to obtain the change relation matrix of each performance parameter in the time dimension; the element in the ith row and the jth column in the change relation matrix may be a performance value of the ith performance parameter at the jth historical moment.
For angle two, the data processing unit calculates the degree of association between every two performance parameters and takes each degree of association as an element of the correlation relation matrix, obtaining the correlation relation matrix between the performance parameters; the element in row i, column j of the correlation relation matrix may be the degree of association between the i-th performance parameter and the j-th performance parameter.
Specifically, the degree of association between any two performance parameters is calculated as follows: obtain the performance threshold of each performance parameter; then compute the probability that the performance values of the two parameters simultaneously exceed their corresponding thresholds, and take this probability as the degree of association between the two parameters.
For example, if among the performance values at 1000 historical moments there are 100 moments at which performance parameters 1 and 2 both exceed their corresponding thresholds, then the element in row 1, column 2 is 100/1000 = 0.1.
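The co-exceedance probability described above can be sketched directly. This is an illustrative NumPy implementation under the stated definition; the names (`correlation_matrix`, `values`, `thresholds`) are assumptions, not from the patent.

```python
import numpy as np

def correlation_matrix(values, thresholds):
    """values: (n_moments, n_params) historical performance values;
    thresholds: (n_params,) performance thresholds.
    Element (i, j) is the fraction of moments at which parameters i and j
    simultaneously exceed their thresholds."""
    exceed = (np.asarray(values) > np.asarray(thresholds)).astype(float)
    # exceed.T @ exceed counts co-exceedance moments for every pair
    return (exceed.T @ exceed) / exceed.shape[0]

# Toy data: of 10 moments, parameter 0 exceeds at moments 0 and 1,
# parameter 1 at moments 1 and 3; they co-exceed only at moment 1,
# so element (0, 1) is 1/10 = 0.1.
values = np.zeros((10, 2))
values[[0, 1], 0] = 1.0
values[[1, 3], 1] = 1.0
C = correlation_matrix(values, thresholds=np.array([0.5, 0.5]))
```

Note that under this definition the matrix is symmetric and the diagonal holds each parameter's own exceedance probability.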
For angle three, for each performance parameter, compute the probability of each performance evaluation result occurring at the moments when the parameter's performance value exceeds its corresponding threshold, and take these probabilities as the elements of the evaluation relation matrix between the performance parameters and the performance evaluation results; the element in row i, column j of the evaluation relation matrix may be the probability that the j-th performance evaluation result occurs when the i-th performance parameter exceeds its corresponding threshold.
For example, assume there are three performance evaluation results. For the 1st performance parameter, suppose that among the performance values at 1000 historical moments, 100 exceed the corresponding threshold, and that of these 100 values, 10 correspond to the first evaluation result, 20 to the second, and 70 to the third. Then the element in row 1, column 1 is 0.1, the element in row 1, column 2 is 0.2, and the element in row 1, column 3 is 0.7.
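The conditional-probability construction above can be sketched as follows, assuming the evaluation results are available as integer class labels per moment (an assumption for illustration; the names are not from the patent).

```python
import numpy as np

def evaluation_relation_matrix(values, thresholds, eval_results, n_classes):
    """values: (n_moments, n_params); thresholds: (n_params,);
    eval_results: (n_moments,) integer evaluation-result labels in
    [0, n_classes). Element (i, j) is the probability that result j
    occurs at moments where parameter i exceeds its threshold."""
    vals = np.asarray(values, dtype=float)
    labels = np.asarray(eval_results)
    exceed = vals > np.asarray(thresholds)
    E = np.zeros((vals.shape[1], n_classes))
    for i in range(vals.shape[1]):
        moments = np.flatnonzero(exceed[:, i])   # moments where param i exceeds
        if moments.size:
            for j in range(n_classes):
                E[i, j] = np.mean(labels[moments] == j)
    return E

# Toy data mirroring the example above: parameter 1 exceeds its threshold
# at 100 of 1000 moments; of those, 10/20/70 fall in evaluation classes
# 0/1/2, giving the row [0.1, 0.2, 0.7].
values = np.zeros((1000, 1))
values[:100, 0] = 1.0
eval_results = np.zeros(1000, dtype=int)
eval_results[:100] = [0] * 10 + [1] * 20 + [2] * 70
E = evaluation_relation_matrix(values, np.array([0.5]), eval_results, 3)
```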
After the data processing unit obtains the relationship matrix corresponding to the three angles, the data processing unit inputs the change relationship matrix to the change relationship attention sub-layer, inputs the correlation relationship matrix to the correlation relationship attention sub-layer, and inputs the evaluation relationship matrix to the evaluation relationship attention sub-layer.
Next, the change relation attention sub-layer, the correlation relation attention sub-layer, and the evaluation relation attention sub-layer each encode their input relation matrix to output the corresponding intermediate encoding. Specifically, the three attention sub-layers perform encoding as follows:
change relationship attention sub-layer:
referring to fig. 3, the intermediate encoding of the change relation attention sub-layer output is determined by the following manner (steps 300-304):
step 300, forming a first hidden layer sequence based on the change relation matrix coding.
Assume that the change relation matrix is specific to [ a ] 1 ,a 2 ,…,a t], wherein ,ai The performance value of each performance parameter at the moment i (i=1, 2, …, t) is recorded as the performance state quantity at the moment i; the change relation attention sub-layer can utilize LSTM (Long Short-Term Memory network) to realize the mining of Long-distance dependent features in the change relation matrix so as to form a first hidden layer sequence { h } 1 ,h 2 ,…,h t }。
Step 302, reverse-order the change relation matrix to form a reverse change relation matrix, and form a second hidden layer sequence based on the reverse change relation matrix.
Because the LSTM receives the performance state quantities in time order, the dependency features it mines are forward dependencies based on the historical data, whereas predicting future data also requires reverse dependency features. The change relation matrix is therefore reverse-ordered to form the reverse change relation matrix [a_t, ..., a_2, a_1], and the LSTM is used to form the second hidden layer sequence {h_1', h_2', ..., h_t'}.
Each hidden layer sequence is generated using the sigmoid and tanh activation functions configured in the LSTM.
Step 304, vector-splice the first hidden layer sequence and the second hidden layer sequence, and take the encoding vector obtained after splicing as the intermediate encoding output by the change relation attention sub-layer.
In the embodiment of the invention, the encoding vector obtained after vector splicing is:

h_i^(1) = [ h_i ; h_i' ]

where h_i^(1) is the encoding at moment i in the intermediate encoding output by the change relation attention sub-layer, h_i is the i-th encoding in the first hidden layer sequence, and h_i' is the i-th encoding in the second hidden layer sequence.
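The bidirectional encoding of steps 300-304 can be sketched with a minimal NumPy LSTM cell: run the cell over the column sequence of the change relation matrix, run it again over the reversed sequence, and concatenate the two hidden sequences element-wise. The random weights and dimensions below are illustrative stand-ins for learned parameters, not values from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_hidden_sequence(xs, W, U, b, hidden):
    """xs: (t, d) input sequence. W: (4h, d), U: (4h, h), b: (4h,).
    Returns the hidden sequence {h_1, ..., h_t} as a (t, h) array."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    out = []
    for x in xs:
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input/forget/output gates
        c = f * c + i * np.tanh(g)                    # cell state update
        h = o * np.tanh(c)                            # hidden state
        out.append(h)
    return np.stack(out)

rng = np.random.default_rng(0)
t, d, hdim = 5, 3, 4
xs = rng.normal(size=(t, d))                   # performance states a_1 ... a_t
W = rng.normal(size=(4 * hdim, d)) * 0.1
U = rng.normal(size=(4 * hdim, hdim)) * 0.1
b = np.zeros(4 * hdim)

h_fwd = lstm_hidden_sequence(xs, W, U, b, hdim)        # {h_1 ... h_t}
h_bwd = lstm_hidden_sequence(xs[::-1], W, U, b, hdim)  # {h_1' ... h_t'}
intermediate = np.concatenate([h_fwd, h_bwd], axis=1)  # spliced encoding
```

In practice a trained bidirectional LSTM would replace the random weights; the sketch only shows the gating and the forward/reverse splice.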
Correlation attention sub-layer:
The intermediate encoding output by the correlation relation attention sub-layer is obtained by an LSTM computing over the input correlation relation matrix with the sigmoid and tanh activation functions; the encoding at moment i in this intermediate encoding is denoted h_i^(2).
Evaluation of relationship attention sub-layer:
Similarly, the intermediate encoding output by the evaluation relation attention sub-layer is obtained by an LSTM computing over the input evaluation relation matrix with the sigmoid and tanh activation functions; the encoding at moment i in this intermediate encoding is denoted h_i^(3).
After each attention sub-layer outputs its intermediate encoding, the intermediate encodings further need to be fused. Referring to fig. 4, the fusion may include:
Step 400, vector-splice the intermediate encodings output by the change relation attention sub-layer, the correlation relation attention sub-layer, and the evaluation relation attention sub-layer to obtain the spliced encoding vectors;
Step 402, determine the degree of attention to the correlation between the performance parameters in the time dimension;
Step 404, take the sum of the products of the degrees of attention and the spliced encoding vectors as the output attention encoding sequence.
In step 402, the attention degree of the front-and-back correlation of the performance parameters in the time dimension may be calculated using the following formulas:

$\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k=1}^{t} \exp(e_{ik})}, \qquad e_{ij} = V \tanh(W s_{i-1} + U c_j), \qquad c_j = [c^{(1)}_j ; c^{(2)}_j ; c^{(3)}_j]$

wherein $\alpha_{ij}$ is the normalized attention degree of the correlation between moment $i$ and moment $j$, $e_{ij}$ is the attention degree of the correlation between moment $i$ and moment $j$ before normalization, $e_{ik}$ is the attention degree of the correlation between moment $i$ and moment $k$ before normalization, $t$ is the number of historical moments in the input sample, and $V$, $W$, $U$ are weight matrices; $s_{i-1}$ is the output value of the hidden layer at moment $(i-1)$, $c^{(1)}_j$ is the intermediate code at moment $j$ output by the change relation attention sub-layer, $c^{(2)}_j$ is the intermediate code at moment $j$ output by the correlation relation attention sub-layer, $c^{(3)}_j$ is the intermediate code at moment $j$ output by the evaluation relation attention sub-layer, and $\tanh$ is the activation function. $c_j$ is the encoding at moment $j$ in the spliced encoded vector.
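A minimal sketch of this attention degree computation follows; the weight shapes are assumptions for the example (the patent does not fix them):

```python
import numpy as np

def attention_degrees(s_prev, c_spliced, V, W, U):
    """Compute alpha_ij = exp(e_ij) / sum_k exp(e_ik), with
    e_ij = V . tanh(W s_prev + U c_j).

    s_prev: hidden layer output at moment i-1, shape (d,).
    c_spliced: spliced encodings c_j for j = 1..t, shape (t, m).
    V: (a,), W: (a, d), U: (a, m) -- illustrative weight shapes.
    """
    e = np.array([V @ np.tanh(W @ s_prev + U @ c_j) for c_j in c_spliced])
    e = e - e.max()                 # subtract max for numerical stability
    w = np.exp(e)
    return w / w.sum()              # normalize over j = 1..t
```

The returned vector is non-negative and sums to one, as the softmax-style normalization in the formula requires.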
It should be noted that the hidden layer is located in the decoding layer. After the decoding layer obtains the spliced encoded vector, it computes the output value of the hidden layer at each moment based on the spliced encoded vector, and the output sample is obtained from the output values of the hidden layer at each moment.
Based on the above formulas, the output attention code sequence is:

$A = (A_1, A_2, \ldots, A_t), \qquad A_i = \sum_{j=1}^{t} \alpha_{ij}\, c_j$

wherein $A_i$ is the code corresponding to moment $i$ in the attention code sequence.
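The weighted-sum step producing each code in the attention code sequence can be sketched as:

```python
import numpy as np

def attention_code(alpha, c_spliced):
    """A_i = sum_j alpha_ij * c_j: the attention-weighted sum of the spliced
    encoding vectors, giving one code per decoding moment i.

    alpha: attention degrees over j, shape (t,).
    c_spliced: spliced encodings c_j, shape (t, m) -- illustrative shapes.
    """
    alpha = np.asarray(alpha, dtype=float)
    c_spliced = np.asarray(c_spliced, dtype=float)
    return (alpha[:, None] * c_spliced).sum(axis=0)
```

For instance, with equal attention degrees [0.5, 0.5] over encodings [0, 2] and [2, 0], the resulting code is their average, [1, 1].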
The machine learning model is trained with each sample pair according to the above processing of a sample pair, until the trained machine learning model is obtained.
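The sample-pair construction used for training (a window of set consecutive historical moments, with the latest moment as the output sample and the earlier moments as the input sample) can be sketched as follows; the array shapes and function name are assumptions for illustration:

```python
import numpy as np

def make_sample_pairs(history, window):
    """Slide a window of `window` consecutive moments over the history.

    history: (T, p) performance values of p performance parameters at T
    historical moments.  Each sample pair: input sample = the first
    window-1 moments, output sample = the latest moment in the window.
    """
    pairs = []
    for start in range(history.shape[0] - window + 1):
        chunk = history[start:start + window]
        pairs.append((chunk[:-1], chunk[-1]))   # (input sample, output sample)
    return pairs
```

With T = 5 historical moments and a window of 3, this yields 3 sample pairs, each with a (2, p) input sample and a (p,) output sample.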
The embodiment of the invention also provides a method for predicting the performance of the quantum network equipment, which comprises the following steps:
acquiring performance values of performance parameters of the quantum network equipment at the current moment and each historical moment;
and inputting the performance values of the performance parameters at the current moment and each historical moment into the trained machine learning model, and outputting the performance predicted value of the performance parameters of the quantum network equipment at the next moment.
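As a hedged illustration of how a trained model could be applied step by step, the sketch below uses a `predict_fn` stand-in for the model (it is not the patent's implementation); each next-moment prediction is appended to the window of performance values and the oldest moment dropped:

```python
import numpy as np

def rolling_forecast(predict_fn, window_values, steps):
    """Iteratively forecast several future moments.

    predict_fn: maps a (w, p) window of performance values to a (p,)
    next-moment prediction -- a stand-in for the trained model.
    window_values: current + historical performance values, shape (w, p).
    """
    window = np.array(window_values, dtype=float)
    out = []
    for _ in range(steps):
        nxt = predict_fn(window)                 # predict the next moment
        out.append(nxt)
        window = np.vstack([window[1:], nxt])    # slide the window forward
    return np.array(out)
```

For single-step use, as in the method above, `steps=1` simply returns the next-moment performance predicted value.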
As shown in fig. 5 and fig. 6, the embodiment of the invention provides a device for predicting performance of quantum network equipment. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. In terms of hardware, fig. 5 shows a hardware architecture diagram of the electronic device where a quantum network device performance prediction apparatus based on a multi-prediction model is located according to an embodiment of the present invention; in addition to the processor, memory, network interface, and nonvolatile memory shown in fig. 5, the electronic device where the apparatus is located may generally include other hardware, such as a forwarding chip responsible for processing packets. Taking a software implementation as an example, as shown in fig. 6, the apparatus in a logical sense is formed by the CPU of the electronic device reading the corresponding computer program from the nonvolatile memory into the memory and running it. The device for predicting the performance of the quantum network device provided in this embodiment includes:
a first obtaining unit 601, configured to obtain performance values of performance parameters of the quantum network device at a current time and at each historical time;
a second obtaining unit 602, configured to input the performance values of the performance parameters at the current time and each historical time into the machine learning model in any of the foregoing embodiments, and output a performance predicted value of the performance parameter of the quantum network device at the next time.
It will be appreciated that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on a device for predicting performance of a quantum network device. In other embodiments of the invention, a quantum network device performance prediction apparatus may include more or fewer components than shown, or may combine certain components, or may split certain components, or may have a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The content of information interaction and execution process between the units in the device is based on the same conception as the embodiment of the method of the present invention, and specific content can be referred to the description in the embodiment of the method of the present invention, which is not repeated here.
The embodiment of the invention also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the performance prediction method of the quantum network equipment in any embodiment of the invention when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor is caused to execute the quantum network device performance prediction method in any embodiment of the invention.
Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer by a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read out from the storage medium may be written into a memory provided in an expansion board inserted into the computer or into a memory provided in an expansion module connected to the computer, and then a CPU or the like mounted on the expansion board or the expansion module may be caused to perform part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above embodiments.
It is noted that relational terms such as first and second, and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one …" does not exclude the presence of additional identical elements in a process, method, article or apparatus that comprises the element.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a computer readable storage medium, where the program, when executed, performs steps including the above method embodiments; and the aforementioned storage medium includes: various media in which program code may be stored, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A method of training a machine learning model, comprising:
building a machine learning model, the machine learning model comprising: an input layer, an encoding layer, and a decoding layer; the input layer is provided with a data processing unit, and the coding layer comprises a change relation attention sub-layer, a correlation relation attention sub-layer and an evaluation relation attention sub-layer;
forming a plurality of sample pairs based on performance values of the performance parameters of the quantum network device at each historical moment; the sample pair includes: setting a plurality of performance values of continuous historical moment performance parameters; the performance values of the performance parameters of the sample pair at the latest historical moment are output samples, and the performance values of the performance parameters of the sample pair at other historical moments are input samples;
training the machine learning model with the plurality of samples;
in the machine learning model, the data processing unit is used for processing an input sample, inputting a change relation matrix of each performance parameter obtained by processing in a time dimension into the change relation attention sub-layer, inputting a correlation relation matrix between the performance parameters obtained by processing into the correlation relation attention sub-layer, and inputting an evaluation relation matrix between the performance parameters and performance evaluation results into the evaluation relation attention sub-layer; the decoding layer is used for fusing the intermediate codes output by each attention sub-layer to obtain an attention coding sequence, and outputting performance values of various performance parameters by using the attention coding sequence;
the merging of the intermediate codes output by each attention sub-layer to obtain an attention code sequence comprises the following steps: vector splicing is carried out on intermediate codes respectively output by the change relation attention sub-layer, the correlation relation attention sub-layer and the evaluation relation attention sub-layer, so as to obtain spliced code vectors; determining the attention degree of the correlation between the front and the back of the performance parameters in the time dimension; taking the sum of the products of the attention and the spliced coding vector as an output attention coding sequence;
the determining the attention degree of the correlation between the performance parameters in the time dimension comprises the following steps:
calculated using the following formulas:

$\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k=1}^{t} \exp(e_{ik})}, \qquad e_{ij} = V \tanh(W s_{i-1} + U [c^{(1)}_j ; c^{(2)}_j ; c^{(3)}_j])$

wherein $\alpha_{ij}$ is the normalized attention degree of the correlation between moment $i$ and moment $j$, $e_{ij}$ is the attention degree of the correlation between moment $i$ and moment $j$ before normalization, $e_{ik}$ is the attention degree of the correlation between moment $i$ and moment $k$ before normalization, $t$ is the number of historical moments in the input sample, and $V$, $W$, $U$ are weight matrices; $s_{i-1}$ is the output value of the hidden layer at moment $(i-1)$, $c^{(1)}_j$ is the intermediate code at moment $j$ output by the change relation attention sub-layer, $c^{(2)}_j$ is the intermediate code at moment $j$ output by the correlation relation attention sub-layer, $c^{(3)}_j$ is the intermediate code at moment $j$ output by the evaluation relation attention sub-layer, and $\tanh$ is the activation function.
2. The method of claim 1, wherein the processing the input samples comprises:
taking the performance value of each performance parameter in the input sample at each historical moment as an element in a change relation matrix to obtain the change relation matrix of each performance parameter in the time dimension;
calculating the association degree between any two performance parameters, and taking the association degree as an element in a correlation matrix to obtain the correlation matrix between the performance parameters;
and calculating the probability of each performance evaluation result corresponding to the performance value of each performance parameter when the performance value exceeds the corresponding performance threshold value, and taking the probability as an element in an evaluation relation matrix to obtain the evaluation relation matrix between the performance parameter and the performance evaluation result.
3. The method according to claim 2, wherein calculating the degree of association between any two performance parameters comprises:
acquiring a performance threshold of each performance parameter;
and calculating the probability of any two performance parameters when the performance values of the two performance parameters exceed the corresponding performance thresholds at the same time, and taking the probability as the association degree between the two performance parameters.
4. The method of claim 1, wherein the intermediate encoding of the change relation attention sub-layer output is determined by:
forming a first hidden layer sequence based on the change relation matrix code;
the change relation matrix is subjected to reverse sequencing to form a reverse change relation matrix, and a second hidden layer sequence is formed based on the reverse change relation matrix;
and vector splicing is carried out on the first hidden layer sequence and the second hidden layer sequence, and a coded vector obtained after vector splicing is used as an intermediate code of the output of the change relation attention sub-layer.
5. A method for predicting performance of a quantum network device, comprising:
acquiring performance values of performance parameters of the quantum network equipment at the current moment and each historical moment;
inputting the performance values of the performance parameters at the current moment and each historical moment into a machine learning model, and outputting the performance predicted value of the performance parameters of the quantum network equipment at the next moment, wherein the machine learning model is trained by using the training method of the machine learning model according to any one of claims 1-4.
6. A quantum network device performance prediction apparatus, comprising:
the first acquisition unit is used for acquiring performance values of performance parameters of the quantum network equipment at the current moment and each historical moment;
the second obtaining unit is configured to input the performance values of the performance parameters at the current time and each historical time into a machine learning model, and output a performance predicted value of the performance parameter of the quantum network device at the next time, where the machine learning model is obtained by training the machine learning model according to the training method of any one of claims 1-4.
7. An electronic device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the method of any of claims 1-4 or claim 5 when the computer program is executed.
8. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-4 or claim 5.
CN202310626553.7A 2023-05-31 2023-05-31 Training method of machine learning model and performance prediction method of quantum network equipment Active CN116361662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310626553.7A CN116361662B (en) 2023-05-31 2023-05-31 Training method of machine learning model and performance prediction method of quantum network equipment


Publications (2)

Publication Number Publication Date
CN116361662A CN116361662A (en) 2023-06-30
CN116361662B (en) 2023-08-15

Family

ID=86905308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310626553.7A Active CN116361662B (en) 2023-05-31 2023-05-31 Training method of machine learning model and performance prediction method of quantum network equipment

Country Status (1)

Country Link
CN (1) CN116361662B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033780A (en) * 2021-03-24 2021-06-25 西北大学 Cloud platform resource prediction method based on double-layer attention mechanism
CN113269363A (en) * 2021-05-31 2021-08-17 西安交通大学 Trajectory prediction method, system, equipment and medium of hypersonic aircraft
CN114566178A (en) * 2022-03-04 2022-05-31 太原科技大学 Robustness speech enhancement method based on self-learning complex convolution neural network
CN115578851A (en) * 2022-07-14 2023-01-06 西北师范大学 Traffic prediction method based on MGCN
CN115905863A (en) * 2022-11-10 2023-04-04 国开启科量子技术(北京)有限公司 Machine learning model training method and quantum network equipment performance value prediction method
CN116168548A (en) * 2023-02-24 2023-05-26 北京工业大学 Traffic flow prediction method of space-time attention pattern convolution network based on multi-feature fusion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635917B (en) * 2018-10-17 2020-08-25 北京大学 Multi-agent cooperation decision and training method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Du Shengdong; Li Tianrui; Yang Yan; Wang Hao; Xie Peng; Hong Xijin. A traffic flow prediction model based on sequence-to-sequence spatio-temporal attention learning. Journal of Computer Research and Development. 2020, (08), full text. *

Also Published As

Publication number Publication date
CN116361662A (en) 2023-06-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant