CN118018773A - Self-learning cloud video generation method and device and computer equipment - Google Patents


Info

Publication number: CN118018773A (granted publication: CN118018773B)
Authority: CN (China)
Application number: CN202410411526.2A
Prior art keywords: cloud video, directional, network, generation, self
Legal status: Granted; Active
Other languages: Chinese (zh)
Inventors: 王曜, 吴光冠
Current and original assignee: Shenzhen Yuntian Changxiang Information Technology Co., Ltd.

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of cloud computing, and in particular to a self-learning cloud video generation method, device, and computer equipment, comprising the following steps: performing deep learning on a plurality of cloud video frames with a generative adversarial network (GAN) to obtain a non-directional generation network for generating cloud video frames non-directionally; performing deep learning on the plurality of cloud video frames with a time-series prediction network (LSTM) to obtain a directional generation network for generating cloud video frames directionally; and performing ensemble learning on the non-directional network and the directional network over the plurality of cloud video frames with a stacking ensemble model, to obtain a self-learning generation network for making adaptive decisions between the directional and non-directional generation results of cloud video frames. By building the self-learning generation network with a stacking ensemble model and deciding adaptively between the directional and non-directional generation results of cloud video frames, deciding between generalization performance and accuracy performance, generalization is achieved while accuracy is ensured and no accuracy is lost.

Description

Self-learning cloud video generation method and device and computer equipment
Technical Field
The invention relates to the technical field of cloud computing, and in particular to a self-learning cloud video generation method, device, and computer equipment.
Background
Cloud video refers to a video network platform service based on the cloud computing business model. On a cloud platform, video suppliers, agents, planning service providers, manufacturers, industry associations, management institutions, industry media, legal institutions, and the like are aggregated into a centralized resource pool in the cloud, where all resources display and interact with one another and communicate on demand to reach agreement, thereby reducing cost and improving efficiency.
At present, cloud video generation mostly relies on acquiring sample data and sample labels, usually through manual annotation, and then generating video according to the annotated labels. This approach focuses only on the accuracy of cloud video generation while ignoring its generalization performance, so the generated cloud video tends to have limited diversity and inclusiveness and reflects only local features, ultimately resulting in poor cloud video generation quality.
Disclosure of Invention
The invention aims to provide a self-learning cloud video generation method, device, and computer equipment to solve the technical problems in the prior art that only the accuracy of cloud video generation is considered while its generalization performance is ignored, so that the generated cloud video tends to have limited diversity and inclusiveness.
To solve the above technical problems, the invention specifically provides the following technical solutions:
in a first aspect of the present invention, a self-learning cloud video generation method includes the following steps:
acquiring a cloud video, wherein the cloud video comprises a plurality of cloud video frames arranged in time sequence;
performing deep learning on the plurality of cloud video frames with a generative adversarial network (GAN) to obtain a non-directional generation network for generating cloud video frames non-directionally;
performing deep learning on the plurality of cloud video frames with a time-series prediction network (LSTM) to obtain a directional generation network for generating cloud video frames directionally;
and performing ensemble learning on the non-directional generation network and the directional generation network over the plurality of cloud video frames with a stacking ensemble model, to obtain a self-learning generation network for making adaptive decisions between the directional and non-directional generation results of cloud video frames.
As a preferred embodiment of the present invention, the method for constructing the non-directional generation network includes:
taking the cloud video frames at earlier time steps among the plurality of cloud video frames as input to the generative model of the generative adversarial network, the generative model outputting a predicted cloud video frame at the later time step;
taking the actual cloud video frame at the later time step and the predicted cloud video frame at the later time step as input to the discriminative model of the generative adversarial network, the discriminative model outputting an evaluation result for the generative model;
training the generative adversarial network based on the evaluation result until the highest evaluation result is reached;
taking the generative adversarial network with the highest evaluation result as the non-directional generation network;
the non-directional generation network being: G_n = GAN({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, and G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd, and (n-1)th time steps of {G_1, G_2, G_3, …, G_(n-1)}, respectively, and GAN is the generative adversarial network.
As a preferred embodiment of the present invention, the method for constructing the directional generation network includes:
taking the cloud video frames at earlier time steps among the plurality of cloud video frames as the input of the time-series prediction network, and taking the cloud video frame at the later time step as the output of the time-series prediction network;
using the time-series prediction network to learn the time-series mapping between its input and output, thereby obtaining the directional generation network;
the directional generation network being:
G_n = LSTM({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, and G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd, and (n-1)th time steps of {G_1, G_2, G_3, …, G_(n-1)}, respectively, and LSTM is the time-series prediction network.
As a preferred embodiment of the present invention, the method for constructing the self-learning generation network includes:
taking the outputs of the non-directional generation network and the directional generation network as the input of the stacking ensemble model, the stacking ensemble model outputting the optimal predicted cloud video frame;
constructing the loss function of the stacking ensemble model with adaptive functions, and training the stacking ensemble model with that loss function to obtain the self-learning generation network;
the self-learning generation network being: G_n-best = stacking(G_n-GAN, G_n-LSTM); where G_n-best is the optimal predicted cloud video frame, G_n-GAN is the output of the non-directional generation network, G_n-LSTM is the output of the directional generation network, and stacking is the stacking ensemble model.
As a preferred embodiment of the present invention, the method for constructing the loss function of the stacking ensemble model includes:
based on the dual-optimization principle that the self-learning generation network inherits the generalization of the non-directional generation network and the accuracy of the directional generation network, constructing the loss function of the stacking ensemble model from the loss functions of the non-directional generation network and the directional generation network with adaptive functions, the adaptive functions being S-shaped (sigmoid) growth functions;
the loss function of the stacking ensemble model being:
Loss_stacking = K * Loss_GAN + L * Loss_LSTM;
L = 1 / [1 + e^(-(x-M))];
K = 1 / [1 + e^(x-M)];
where Loss_stacking is the loss function of the stacking ensemble model, Loss_GAN is the loss function of the non-directional generation network, Loss_LSTM is the loss function of the directional generation network, K is the adaptive function of the non-directional generation network, L is the adaptive function of the directional generation network, x is the training stage of the stacking ensemble model, and M is a constant term.
As a preferred embodiment of the present invention, the loss function of the non-directional generation network is the difference between the predicted cloud video frame output by the non-directional generation network and the corresponding ground-truth cloud video frame, quantified by the mean square error: Loss_GAN = MSE(G_n, G_real), where Loss_GAN is the loss function of the non-directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding ground-truth cloud video frame.
As a preferred embodiment of the present invention, the loss function of the directional generation network is the difference between the predicted cloud video frame output by the directional generation network and the corresponding ground-truth cloud video frame, quantified by the mean square error: Loss_LSTM = MSE(G_n, G_real), where Loss_LSTM is the loss function of the directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding ground-truth cloud video frame.
As a preferred embodiment of the present invention, the stacking ensemble model is any one of a classifier or a neural network.
In a second aspect of the present invention, a self-learning cloud video generating device, applied to the above self-learning cloud video generation method, includes:
a data acquisition unit for acquiring a cloud video, the cloud video comprising a plurality of cloud video frames arranged in time sequence;
a deep learning unit for performing deep learning on the plurality of cloud video frames with the generative adversarial network GAN to obtain a non-directional generation network for generating cloud video frames non-directionally;
the deep learning unit being further used for performing deep learning on the plurality of cloud video frames with the time-series prediction network LSTM to obtain a directional generation network for generating cloud video frames directionally;
and for performing ensemble learning on the non-directional network and the directional network over the plurality of cloud video frames with the stacking ensemble model to obtain a self-learning generation network for making adaptive decisions between the directional and non-directional generation results of cloud video frames;
and a video generation unit for generating cloud video with the self-learning generation network.
In a third aspect of the invention, a computer device includes: at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the computer device to perform the self-learning cloud video generation method.
Compared with the prior art, the invention has the following beneficial effects:
By performing deep learning with the generative adversarial network GAN and the time-series prediction network LSTM respectively to generate cloud video frames, the invention correspondingly constructs a cloud video generation model with generalization performance and a cloud video generation model with accuracy performance; it then builds a self-learning generation network with a stacking ensemble model that makes adaptive decisions between the directional and non-directional generation results of cloud video frames, deciding between generalization performance and accuracy performance so that generalization is achieved while accuracy is ensured and no accuracy is lost.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only, and that other drawings can be derived from them without inventive effort.
Fig. 1 is a flowchart of a self-learning cloud video generation method based on computing power requirements according to an embodiment of the present invention;
Fig. 2 is a block diagram of a self-learning cloud video generating device based on computing power requirements according to an embodiment of the present invention;
Fig. 3 is an internal structure diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, in a first aspect of the present invention, a self-learning cloud video generation method includes the following steps:
acquiring a cloud video, wherein the cloud video comprises a plurality of cloud video frames arranged in time sequence;
performing deep learning on the plurality of cloud video frames with a generative adversarial network (GAN) to obtain a non-directional generation network for generating cloud video frames non-directionally;
performing deep learning on the plurality of cloud video frames with a time-series prediction network (LSTM) to obtain a directional generation network for generating cloud video frames directionally;
and performing ensemble learning on the non-directional generation network and the directional generation network over the plurality of cloud video frames with a stacking ensemble model, to obtain a self-learning generation network for making adaptive decisions between the directional and non-directional generation results of cloud video frames.
According to the method, the generative adversarial network GAN performs deep learning of the time-series patterns of the cloud video frames, so that subsequent cloud video is predicted and produced from the existing cloud video, that is, cloud video is generated by a deep learning method.
Generation with a GAN is random and surprising in image generation, so generating cloud video with a GAN has the advantage of non-directionality, namely outstanding generalization performance, which ensures the generalization capability of cloud video generation.
Likewise, the time-series prediction network LSTM performs deep learning of the time-series patterns of the cloud video frames, so that subsequent cloud video is predicted and produced from the existing cloud video by deep learning.
The time-series prediction network LSTM is sequential in image generation, so generating cloud video with the LSTM has the advantage of directionality, namely outstanding accuracy performance, which ensures the accuracy capability of cloud video generation.
Therefore, by performing deep learning with the GAN and the LSTM to generate cloud video frames, a cloud video generation model with generalization performance and a cloud video generation model with accuracy performance are constructed correspondingly.
Furthermore, the invention performs ensemble learning on the non-directional network and the directional network with the stacking ensemble model to obtain a self-learning generation network that makes adaptive decisions between the directional and non-directional generation results of cloud video frames.
Integrating the non-directional network and the directional network with the stacking ensemble model gives the self-learning generation network both generalization capability and accuracy capability when generating cloud video, ensuring the quality of cloud video generation.
Through the loss function of the stacking ensemble model, the non-directional network and the directional network undergo ensemble learning such that, in the initial stage of cloud video generation, the adaptive function value of the non-directional network is close to 1 and that of the directional network is close to 0, ensuring the generalization of cloud video generation, gathering all information for generation, and avoiding the loss of generated information.
As the generation stage advances, the adaptive function value of the non-directional network decreases from 1 while that of the directional network increases from 0, shifting the emphasis from generalization to accuracy and gradually filtering the gathered information, until in the final stage the adaptive function value of the non-directional network approaches 0 and that of the directional network approaches 1; this ensures the accuracy of the final generation stage, filtering the gathered information and keeping the most precise information for generating the cloud video.
Thus a self-learning generation network is built with the stacking ensemble model, making adaptive decisions between the directional and non-directional generation results of cloud video frames and deciding between generalization performance and accuracy performance, so that generalization is achieved while accuracy is ensured and no accuracy is lost.
According to the invention, the generative adversarial network GAN performs deep learning of the time-series patterns of the cloud video frames, so that subsequent cloud video is predicted and produced from the existing cloud video, that is, cloud video is generated by deep learning. The specific steps are as follows:
the method for constructing the non-directional generation network includes:
taking the cloud video frames at earlier time steps among the plurality of cloud video frames as input to the generative model of the generative adversarial network, the generative model outputting a predicted cloud video frame at the later time step;
taking the actual cloud video frame at the later time step and the predicted cloud video frame at the later time step as input to the discriminative model of the generative adversarial network, the discriminative model outputting an evaluation result for the generative model;
training the generative adversarial network based on the evaluation result until the highest evaluation result is reached;
taking the generative adversarial network with the highest evaluation result as the non-directional generation network;
the non-directional generation network being:
G_n = GAN({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, and G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd, and (n-1)th time steps of {G_1, G_2, G_3, …, G_(n-1)}, respectively, and GAN is the generative adversarial network.
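As a rough, runnable illustration of this adversarial training signal (not the patent's implementation: the toy frame data, the linear generator and discriminator, and all dimensions are assumptions made only for the sketch), one GAN step for next-frame prediction can be computed as:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 16                         # assumed: n frames, d values per flattened frame
frames = rng.normal(size=(n, d))     # stand-in cloud video {G_1, ..., G_n}

def generator(history, W_g):
    # Predicts G_n from the preceding frames {G_1 .. G_(n-1)}; here a linear
    # map of their mean, whereas a real generator would be a deep network.
    return np.tanh(history.mean(axis=0) @ W_g)

def discriminator(frame, w_d):
    # Scores a frame in (0, 1): the probability that it is real, not generated.
    return 1.0 / (1.0 + np.exp(-(frame @ w_d)))

W_g = rng.normal(scale=0.1, size=(d, d))
w_d = rng.normal(scale=0.1, size=d)

fake = generator(frames[:-1], W_g)           # predicted frame at the later time step
score_real = discriminator(frames[-1], w_d)  # evaluation of the real later frame
score_fake = discriminator(fake, w_d)        # evaluation of the predicted frame

# Standard GAN objectives: the discriminator is trained to raise score_real
# and lower score_fake; the generator is trained to raise score_fake.
loss_d = -(np.log(score_real) + np.log(1.0 - score_fake))
loss_g = -np.log(score_fake)
```

Alternating updates that lower loss_d and loss_g correspond to the training loop described above, stopping at the highest evaluation result.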
According to the invention, the time-series prediction network LSTM performs deep learning of the time-series patterns of the cloud video frames, so that subsequent cloud video is predicted and produced from the existing cloud video, that is, cloud video is generated by deep learning. The specific steps are as follows:
the method for constructing the directional generation network includes:
taking the cloud video frames at earlier time steps among the plurality of cloud video frames as the input of the time-series prediction network, and taking the cloud video frame at the later time step as the output of the time-series prediction network;
using the time-series prediction network to learn the time-series mapping between its input and output, thereby obtaining the directional generation network;
the directional generation network being:
G_n = LSTM({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, and G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd, and (n-1)th time steps of {G_1, G_2, G_3, …, G_(n-1)}, respectively, and LSTM is the time-series prediction network.
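A minimal sketch of G_n = LSTM({G_1, …, G_(n-1)}) as a single forward pass (the dimensions, random weights, and linear output head are assumptions; a trained directional generation network would learn these parameters from the cloud video frames):

```python
import numpy as np

rng = np.random.default_rng(1)
d, h = 16, 32                        # assumed frame dimension and hidden size

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One LSTM cell update; gates stacked in z as [input, forget, output, candidate].
    z = W @ x + U @ h_prev + b
    i, f, o = sigmoid(z[:h]), sigmoid(z[h:2*h]), sigmoid(z[2*h:3*h])
    g = np.tanh(z[3*h:])
    c = f * c_prev + i * g           # cell state carries the time-series memory
    return o * np.tanh(c), c

W = rng.normal(scale=0.1, size=(4*h, d))
U = rng.normal(scale=0.1, size=(4*h, h))
b = np.zeros(4*h)
W_out = rng.normal(scale=0.1, size=(d, h))  # maps the final hidden state to a frame

history = rng.normal(size=(7, d))           # {G_1 .. G_(n-1)}, flattened frames
h_t, c_t = np.zeros(h), np.zeros(h)
for x in history:                           # feed the frames in time order
    h_t, c_t = lstm_step(x, h_t, c_t, W, U, b)
G_n = W_out @ h_t                           # directionally predicted next frame
```

Feeding the frames strictly in time order is what makes this branch directional: the prediction depends on the learned sequence mapping rather than on random sampling.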
According to the invention, the stacking ensemble model integrates the non-directional network and the directional network, so that the self-learning generation network has both generalization capability and accuracy capability when generating cloud video, ensuring the quality of cloud video generation. The specific steps are as follows:
the method for constructing the self-learning generation network includes:
taking the outputs of the non-directional generation network and the directional generation network as the input of the stacking ensemble model, the stacking ensemble model outputting the optimal predicted cloud video frame;
constructing the loss function of the stacking ensemble model with adaptive functions, and training the stacking ensemble model with that loss function to obtain the self-learning generation network;
the self-learning generation network being:
G_n-best = stacking(G_n-GAN, G_n-LSTM); where G_n-best is the optimal predicted cloud video frame, G_n-GAN is the output of the non-directional generation network, G_n-LSTM is the output of the directional generation network, and stacking is the stacking ensemble model.
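G_n-best = stacking(G_n-GAN, G_n-LSTM) can be sketched as a meta-model over the two base outputs. The patent allows any classifier or neural network as the meta-model; a single learned sigmoid gate over hypothetical frame values is the simplest runnable stand-in:

```python
import numpy as np

def stacking_combine(g_gan, g_lstm, w=0.0):
    # Meta-model: a learned convex combination of the two base predictions.
    # alpha is the gate's trainable parameter passed through a sigmoid.
    alpha = 1.0 / (1.0 + np.exp(-w))
    return alpha * g_gan + (1.0 - alpha) * g_lstm

g_gan  = np.array([0.2, 0.8])              # hypothetical G_n-GAN output
g_lstm = np.array([0.4, 0.6])              # hypothetical G_n-LSTM output
g_best = stacking_combine(g_gan, g_lstm)   # with w = 0, alpha = 0.5
```

Training the gate parameter w under the adaptive loss below is what turns this fixed average into an adaptive decision between the two branches.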
The method for constructing the loss function of the stacking ensemble model includes:
based on the dual-optimization principle that the self-learning generation network inherits the generalization of the non-directional generation network and the accuracy of the directional generation network, constructing the loss function of the stacking ensemble model from the loss functions of the non-directional generation network and the directional generation network with adaptive functions, the adaptive functions being S-shaped (sigmoid) growth functions;
the loss function of the stacking ensemble model being:
Loss_stacking = K * Loss_GAN + L * Loss_LSTM;
L = 1 / [1 + e^(-(x-M))];
K = 1 / [1 + e^(x-M)];
where Loss_stacking is the loss function of the stacking ensemble model, Loss_GAN is the loss function of the non-directional generation network, Loss_LSTM is the loss function of the directional generation network, K is the adaptive function of the non-directional generation network, L is the adaptive function of the directional generation network, x is the training stage of the stacking ensemble model, and M is a constant term.
Through this loss function, the non-directional network and the directional network undergo ensemble learning such that, in the initial stage of cloud video generation, the adaptive function value of the non-directional network is close to 1 and that of the directional network is close to 0, ensuring the generalization of cloud video generation, gathering all information for generation, and avoiding the loss of generated information.
As the generation stage advances, the adaptive function value of the non-directional network decreases from 1 while that of the directional network increases from 0, shifting the emphasis from generalization to accuracy and gradually filtering the gathered information, until in the final stage the adaptive function value of the non-directional network approaches 0 and that of the directional network approaches 1; this ensures the accuracy of the final generation stage, filtering the gathered information and keeping the most precise information for generating the cloud video.
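The handover described above follows directly from the sigmoid forms of K and L. A quick numerical check (M = 50 is an assumed value of the constant term, chosen only for illustration):

```python
import math

M = 50  # assumed constant term: the stage at which the two weights cross over

def K(x):
    # Adaptive weight of the non-directional (GAN) branch: starts near 1.
    return 1.0 / (1.0 + math.exp(x - M))

def L(x):
    # Adaptive weight of the directional (LSTM) branch: starts near 0.
    return 1.0 / (1.0 + math.exp(-(x - M)))

for x in (0, 50, 100):
    print(f"stage {x:3d}: K = {K(x):.4f}, L = {L(x):.4f}")
# K and L always sum to 1, so Loss_stacking remains a convex
# combination of Loss_GAN and Loss_LSTM at every training stage.
```

At x = 0 the GAN weight dominates, at x = M the two weights are equal, and at large x the LSTM weight dominates, matching the generalization-to-accuracy handover in the text.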
Thus a self-learning generation network is built with the stacking ensemble model, making adaptive decisions between the directional and non-directional generation results of cloud video frames and deciding between generalization performance and accuracy performance, so that generalization is achieved while accuracy is ensured and no accuracy is lost.
The loss function of the non-directional generation network is the difference between the predicted cloud video frame output by the non-directional generation network and the corresponding ground-truth cloud video frame, quantified by the mean square error: Loss_GAN = MSE(G_n, G_real), where Loss_GAN is the loss function of the non-directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding ground-truth cloud video frame.
The loss function of the directional generation network is the difference between the predicted cloud video frame output by the directional generation network and the corresponding ground-truth cloud video frame, quantified by the mean square error: Loss_LSTM = MSE(G_n, G_real), where Loss_LSTM is the loss function of the directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding ground-truth cloud video frame.
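Putting the MSE losses and the adaptive weights together, Loss_stacking for one predicted frame can be sketched as follows (the frame values and M = 50 are hypothetical, chosen only to make the two stages easy to compare):

```python
import math

def mse(pred, real):
    # Mean square error between a predicted frame and its ground truth.
    return sum((p - r) ** 2 for p, r in zip(pred, real)) / len(pred)

def loss_stacking(g_gan, g_lstm, g_real, x, M=50):
    K = 1.0 / (1.0 + math.exp(x - M))      # non-directional (GAN) weight
    L = 1.0 / (1.0 + math.exp(-(x - M)))   # directional (LSTM) weight
    return K * mse(g_gan, g_real) + L * mse(g_lstm, g_real)

g_real = [0.0, 0.0]
g_gan  = [1.0, 1.0]    # hypothetical GAN prediction,  MSE = 1.0
g_lstm = [0.5, 0.5]    # hypothetical LSTM prediction, MSE = 0.25

early = loss_stacking(g_gan, g_lstm, g_real, x=0)    # dominated by the GAN term
late  = loss_stacking(g_gan, g_lstm, g_real, x=100)  # dominated by the LSTM term
```

Early in training the total loss tracks the non-directional branch, and late in training it tracks the directional branch, which is exactly the generalization-then-accuracy schedule the loss function encodes.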
The stacking ensemble model may be any one of a classifier or a neural network.
As shown in fig. 2, in a second aspect of the present invention, a self-learning cloud video generating device, applied to the self-learning cloud video generation method, includes:
a data acquisition unit for acquiring a cloud video, the cloud video comprising a plurality of cloud video frames arranged in time sequence;
a deep learning unit for performing deep learning on the plurality of cloud video frames with the generative adversarial network GAN to obtain a non-directional generation network for generating cloud video frames non-directionally;
the deep learning unit being further used for performing deep learning on the plurality of cloud video frames with the time-series prediction network LSTM to obtain a directional generation network for generating cloud video frames directionally;
and for performing ensemble learning on the non-directional network and the directional network over the plurality of cloud video frames with the stacking ensemble model to obtain a self-learning generation network for making adaptive decisions between the directional and non-directional generation results of cloud video frames;
and a video generation unit for generating cloud video with the self-learning generation network.
As shown in fig. 3, in a third aspect of the present invention, a computer device includes: at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the computer device to perform the self-learning cloud video generation method.
According to the invention, deep learning is performed separately with the generative adversarial network GAN and the time sequence prediction network LSTM to generate cloud video frames, correspondingly constructing a cloud video generation model with generalization performance and a cloud video generation model with precision performance; a self-learning generation network is then constructed with the stacking integration model to make adaptive decisions on the directional and non-directional generation results of the cloud video frames, trading off generalization performance against precision performance so that generalization is achieved while precision is preserved.
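A minimal sketch of this adaptive trade-off, using the stacking loss Loss_stacking = K*Loss_GAN + L*Loss_LSTM and its sigmoid adaptive functions from the claims: K and L always sum to 1, so training shifts smoothly from the generalizing GAN branch to the precise LSTM branch as the training stage x passes the constant M (the value M = 50 here is an arbitrary illustration):

```python
import math

def adaptive_weights(x: float, M: float = 50.0):
    # L = 1/(1 + e^-(x-M)) weights the directional (LSTM) loss;
    # K = 1/(1 + e^(x-M))  weights the non-directional (GAN) loss.
    L = 1.0 / (1.0 + math.exp(-(x - M)))
    K = 1.0 / (1.0 + math.exp(x - M))
    return K, L  # K + L == 1 for every x

def loss_stacking(loss_gan: float, loss_lstm: float,
                  x: float, M: float = 50.0) -> float:
    # Loss_stacking = K * Loss_GAN + L * Loss_LSTM
    K, L = adaptive_weights(x, M)
    return K * loss_gan + L * loss_lstm

print(adaptive_weights(0.0))   # early training: K ~ 1, L ~ 0 (GAN dominates)
print(adaptive_weights(50.0))  # at x = M: K = L = 0.5 (equal weighting)
```

Because the weights are complementary sigmoids of the training stage, no hard switch between the two branches is needed; the decision boundary is learned implicitly through M.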
The above embodiments are only exemplary embodiments of the present application and are not intended to limit the present application, the scope of which is defined by the claims. Various modifications and equivalent arrangements of this application will occur to those skilled in the art, and are intended to be within the spirit and scope of the application.

Claims (10)

1. The self-learning cloud video generation method is characterized by comprising the following steps of:
Acquiring cloud video, wherein the cloud video comprises a plurality of cloud video frames arranged in time sequence;
Performing deep learning according to a plurality of cloud video frames by using a generative adversarial network GAN to obtain a non-directional generation network for non-directional generation of cloud video frames;
performing deep learning according to a plurality of cloud video frames by utilizing a time sequence prediction network LSTM to obtain a directional generation network for directional generation of the cloud video frames;
And performing ensemble learning on the non-directional generation network and the directional generation network according to the plurality of cloud video frames by using a stacking integration model, to obtain a self-learning generation network for making adaptive decisions on the directional and non-directional generation results of the cloud video frames.
2. The self-learning cloud video generation method of claim 1, wherein the non-directional generation network is constructed by the following steps:
taking the cloud video frames at the front time sequence among the plurality of cloud video frames as input items of the generative model in the generative adversarial network, the generative model outputting the predicted cloud video frame at the rear time sequence;
taking the cloud video frame at the rear time sequence among the plurality of cloud video frames and the predicted cloud video frame at the rear time sequence as input items of the discrimination model in the generative adversarial network, the discrimination model outputting an evaluation result of the generative model;
training the generative adversarial network, based on the evaluation results, until the highest evaluation result is reached;
taking the generative adversarial network that achieves the highest evaluation result as the non-directional generation network;
The non-directional generation network is:
G_n = GAN({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd and (n-1)th time sequence positions in {G_1, G_2, G_3, …, G_(n-1)}, respectively, and GAN is the generative adversarial network.
3. The self-learning cloud video generation method of claim 2, wherein the directional generation network is constructed by the following steps:
Taking the cloud video frames at the front time sequence in the plurality of cloud video frames as an input item of a time sequence prediction network, and taking the cloud video frames at the rear time sequence in the plurality of cloud video frames as an output item of the time sequence prediction network;
using a time sequence prediction network to learn a time sequence mapping relation between an input item of the time sequence prediction network and an output item of the time sequence prediction network to obtain the directional generation network;
The directional generation network is: G_n = LSTM({G_1, G_2, G_3, …, G_(n-1)}); where G_n is the predicted cloud video frame, {G_1, G_2, G_3, …, G_(n-1)} is the cloud video formed by the n-1 cloud video frames preceding the predicted frame, G_1, G_2, G_3, G_(n-1) are the cloud video frames at the 1st, 2nd, 3rd and (n-1)th time sequence positions in {G_1, G_2, G_3, …, G_(n-1)}, respectively, and LSTM is the time sequence prediction network.
4. The self-learning cloud video generation method of claim 3, wherein the self-learning generation network is constructed by the following steps:
taking the output items of the non-directional generation network and the directional generation network as input items of the stacking integration model, the stacking integration model outputting the optimal predicted cloud video frame;
constructing the loss function of the stacking integration model by using adaptive functions, and training the stacking integration model with that loss function to obtain the self-learning generation network;
The self-learning generation network is: G_n-best = stacking(G_n-GAN, G_n-LSTM); where G_n-best is the optimal predicted cloud video frame, G_n-GAN is the output item of the non-directional generation network, G_n-LSTM is the output item of the directional generation network, and stacking is the stacking integration model.
5. The self-learning cloud video generation method of claim 4, wherein the loss function of the stacking integration model is constructed by the following steps:
based on a dual-optimization principle whereby the self-learning generation network inherits both the generalization of the non-directional generation network and the accuracy of the directional generation network, constructing the loss function of the stacking integration model from the loss functions of the non-directional generation network and the directional generation network by means of adaptive functions, wherein the adaptive functions are S-shaped (sigmoid) growth functions;
The loss function of the stacking integration model is:
Loss_stacking = K * Loss_GAN + L * Loss_LSTM;
L = 1 / [1 + e^(-(x-M))];
K = 1 / [1 + e^(x-M)]; where Loss_stacking is the loss function of the stacking integration model, Loss_GAN is the loss function of the non-directional generation network, Loss_LSTM is the loss function of the directional generation network, K is the adaptive function of the non-directional generation network, L is the adaptive function of the directional generation network, x is the training stage of the stacking integration model, and M is a constant term.
6. The self-learning cloud video generation method of claim 5, wherein the loss function of the non-directional generation network is the difference between the predicted cloud video frame output by the non-directional generation network and the cloud video frame true value corresponding to that predicted frame, quantified by mean square error: Loss_GAN = MSE(G_n, G_real), where Loss_GAN is the loss function of the non-directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding cloud video frame true value.
7. The self-learning cloud video generation method of claim 6, wherein the loss function of the directional generation network is the difference between the predicted cloud video frame output by the directional generation network and the cloud video frame true value corresponding to that predicted frame, quantified by mean square error: Loss_LSTM = MSE(G_n, G_real), where Loss_LSTM is the loss function of the directional generation network, MSE is the mean square error, G_n is the predicted cloud video frame, and G_real is the corresponding cloud video frame true value.
8. The self-learning cloud video generation method of claim 7, wherein the stacking integration model comprises any one of a classifier and a neural network.
9. A self-learning cloud video generation device, which is applied to the self-learning cloud video generation method of any one of claims 1 to 8, and comprises:
The cloud video processing device comprises a data acquisition unit, a data processing unit and a data processing unit, wherein the data acquisition unit is used for acquiring cloud video, and the cloud video comprises a plurality of cloud video frames arranged in time sequence;
the deep learning unit is used for: performing deep learning according to the plurality of cloud video frames by using a generative adversarial network GAN to obtain a non-directional generation network for non-directional generation of cloud video frames;
performing deep learning according to the plurality of cloud video frames by using a time sequence prediction network LSTM to obtain a directional generation network for directional generation of cloud video frames; and
performing ensemble learning on the non-directional generation network and the directional generation network according to the plurality of cloud video frames by using a stacking integration model, to obtain a self-learning generation network for making adaptive decisions on the directional and non-directional generation results of the cloud video frames;
And the video generation unit is used for generating cloud video by utilizing the self-learning generation network.
10. A computer device, comprising: at least one processor; and
A memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause a computer device to perform the method of any of claims 1-8.
CN202410411526.2A 2024-04-08 2024-04-08 Self-learning cloud video generation method and device and computer equipment Active CN118018773B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410411526.2A CN118018773B (en) 2024-04-08 2024-04-08 Self-learning cloud video generation method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN118018773A true CN118018773A (en) 2024-05-10
CN118018773B CN118018773B (en) 2024-06-07

Family

ID=90956659

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113395603A (en) * 2021-06-25 2021-09-14 合肥工业大学 Point cloud video stream self-adaptive transmission method based on model predictive control
WO2023075630A1 (en) * 2021-10-28 2023-05-04 Huawei Technologies Co., Ltd. Adaptive deep-learning based probability prediction method for point cloud compression
CN117061791A (en) * 2023-10-12 2023-11-14 深圳云天畅想信息科技有限公司 Cloud video frame self-adaptive collaborative rendering method and device and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG YUQING; DONG YING; LIU CAIYUN; LEI KENAN; SUN HONGYU: "Current status, trends and prospects of deep learning applied to cyberspace security", Journal of Computer Research and Development, no. 06, 12 January 2018 (2018-01-12) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant