CN116070170A

CN116070170A - A method and system for cloud-edge-device data fusion processing based on deep learning

Info

Publication number: CN116070170A
Application number: CN202310060889.1A
Authority: CN
Inventors: 袁小芳; 许浩志; 李哲; 王耀南
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2023-01-15
Filing date: 2023-01-15
Publication date: 2023-05-05

Abstract

The invention discloses a deep-learning-based cloud-edge-device data fusion processing method and system. A cloud-edge-device data fusion model is built in advance. Multi-source heterogeneous data are acquired by terminal devices and preprocessed to obtain multiple kinds of homologous heterogeneous data. The cloud-edge-device data fusion model is trained, and the homologous heterogeneous data are input to the edge computing platform of the fusion model and processed to obtain multiple high-level fusion feature vectors and multiple edge decisions; the edge decisions are fed back to the terminal devices and fed forward to the cloud computing center of the fusion model. The high-level fusion feature vectors are input to the cloud computing center, where the cloud computing center model produces a cloud decision that is fed back to the terminal devices and the edge computing platform. By establishing a distributed data processing system, the method can effectively address the communication and computation problems caused by multi-source heterogeneous data, improve the overall decision-making and scheduling capability of the system, and promote the comprehensive development of the Internet of Things.

Description

A method and system for cloud-edge-device data fusion processing based on deep learning

Technical field

The present invention relates to the field of cloud-edge-device data fusion, and in particular to a deep-learning-based cloud-edge-device data fusion processing method and system.

Background

With the arrival of the Internet of Everything era, computing demand has grown explosively. The traditional cloud computing architecture can no longer meet the massive computing demand created by surging Internet traffic and the geometric growth of data, and the growth of the traditional cloud computing architecture is slowing. Moreover, as the computing and storage capabilities of edge computing platforms and terminal devices keep improving, pushing cloud computing tasks down to the edge side and the device side and building a cloud-edge-device collaborative architecture will be an important future trend.

Traditional cloud computing builds and operates cloud computing centers in a centralized manner, and customers use cloud computing resources over the Internet. With the development of cloud computing technology and the spread of applications, cloud computing has shown the following limitations. First, when the volume of data collected at the front end is too large, the traditional upload approach is costly and inefficient. Second, in scenarios that require real-time interaction, all data and computation are concentrated in the computing center, so information transmission is costly and latency is high. Third, applications that require high stability and continuity depend strongly on stable cloud services and networks, which reduces their robustness and reliability.

Edge computing has therefore begun to attract the attention of major Internet and cloud service vendors. Edge computing processes data mainly at the network edge, close to the data source. The edge platform integrates computing, storage, transmission and self-management, and its real-time responsiveness can greatly improve the efficiency of data collection and of advanced applications.

At present, the integrated cloud-edge-device collaborative computing system is in its infancy. Unlike the centralized multi-source heterogeneous data fusion of traditional cloud computing, distributed multi-source heterogeneous data fusion under cloud-edge-device collaboration faces many challenges. Among them, how to effectively allocate and use computing resources at each level, improve data processing speed and capacity, effectively train distributed deep learning models, and complete the fusion of multi-source heterogeneous data and information are problems that urgently need to be solved.

Summary of the invention

The present invention provides a deep-learning-based cloud-edge-device data fusion processing model and method, in order to achieve efficient processing and effective fusion of multi-source heterogeneous data in an integrated cloud-edge-device collaborative system.

A deep-learning-based cloud-edge-device data fusion processing method, the method comprising:

S1. Build a cloud-edge-device data fusion model. The model includes multiple terminal devices, an edge computing platform, an edge computing platform model, a cloud computing center, and a cloud computing center model. The edge computing platform model is deployed on the edge computing platform, and the cloud computing center model is deployed on the cloud computing center. The terminal devices are connected to the edge computing platform, the edge computing platform is connected to the cloud computing center, and the cloud computing center is connected to the terminal devices;

S2. After acquiring multi-source heterogeneous data with the terminal devices, preprocess the multi-source heterogeneous data to obtain multiple kinds of homologous heterogeneous data;

S3. Train the cloud-edge-device data fusion model using the multiple kinds of homologous heterogeneous data as the training set, back-propagating and updating parameters until the trained model is obtained. After training ends, input the homologous heterogeneous data collected and processed by the terminal devices, as a data set, into the trained cloud-edge-device data fusion model;

S4. The edge computing platform in the trained model receives the multiple kinds of homologous heterogeneous data in the data set and processes them with the edge computing platform model to obtain multiple high-level fusion feature vectors and multiple edge decisions. It transmits the high-level fusion feature vectors to the cloud computing center, feeds the edge decisions back to the terminal devices to control their collection of multi-source heterogeneous data, and feeds the edge decisions forward to the cloud computing center;

S5. The cloud computing center in the trained model receives the high-level fusion feature vectors and processes them with the cloud computing center model to obtain a cloud decision. The cloud computing center controls its cloud decision according to the edge decisions fed forward to it. The cloud computing center feeds the cloud decision back to the terminal devices to control their collection of multi-source heterogeneous data, and feeds the cloud decision back to the edge computing platform to control the edge decisions of the edge computing platform.

Preferably, acquiring multi-source heterogeneous data with the terminal devices in S2 and preprocessing them to obtain multiple kinds of homologous heterogeneous data specifically includes:

S21. Acquire multi-source heterogeneous data with the terminal devices;

S22. Calibrate the multi-source heterogeneous data to a synchronized frequency by timestamp sampling and linear interpolation;

S23. Standardize the frequency-synchronized multi-source heterogeneous data to obtain multiple kinds of homologous heterogeneous data.
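The timestamp-based calibration of S22 can be illustrated with a minimal sketch. It assumes each source delivers a list of strictly increasing timestamps with one scalar reading per timestamp, and that all sources are resampled onto a shared list of target timestamps. The function name `resample_linear` and the clamping of out-of-range targets to the nearest endpoint are assumptions made for illustration and are not taken from the patent text.

```python
from bisect import bisect_left


def resample_linear(timestamps, values, target_times):
    """Linearly interpolate one source's readings at the shared target times.

    timestamps must be strictly increasing. Targets outside the recorded
    range are clamped to the nearest endpoint value (an assumption).
    """
    resampled = []
    for t in target_times:
        if t <= timestamps[0]:
            resampled.append(values[0])
        elif t >= timestamps[-1]:
            resampled.append(values[-1])
        else:
            k = bisect_left(timestamps, t)  # first index with timestamps[k] >= t
            t0, t1 = timestamps[k - 1], timestamps[k]
            v0, v1 = values[k - 1], values[k]
            weight = (t - t0) / (t1 - t0)
            resampled.append(v0 + weight * (v1 - v0))
    return resampled
```

Calling `resample_linear` once per source with the same `target_times` yields series that share a common sampling grid, which is the precondition for the standardization in S23.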

Preferably, the multiple kinds of homologous heterogeneous data are obtained in S23 by the following formula:

where x_ij' denotes the j-th standardized item of the i-th kind of homologous heterogeneous data, x_ij denotes the j-th item of the i-th kind of homologous heterogeneous data before standardization, j = 1, 2, ..., N, N denotes the total number of items of the i-th kind of homologous heterogeneous data, and μ_i and δ_i denote the overall mean and variance of the i-th kind of homologous heterogeneous data.
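The formula itself did not survive in this text. Because the description names Z-Score standardization as the method, the formula presumably takes the standard form below, written with the symbols defined above. This is a reconstruction under that assumption. Note that the text calls δ_i the variance, whereas conventional Z-Score standardization divides by the standard deviation; which of the two is intended cannot be determined from this text.

```latex
x_{ij}' = \frac{x_{ij} - \mu_i}{\delta_i}, \qquad j = 1, 2, \ldots, N
```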

Preferably, the edge computing platform model includes multiple computing node models arranged in parallel, and in S4 the processing by which the edge computing platform in the trained model receives the homologous heterogeneous data in the data set and obtains multiple high-level fusion feature vectors and multiple edge decisions specifically includes:

S41. Store the multiple kinds of homologous heterogeneous data in the data set in the multiple computing node models, one kind of homologous heterogeneous data per computing node model;

S42. Each computing node model includes a neural network, a first multilayer perceptron and a first Softmax classifier. The neural network extracts features from the homologous heterogeneous data stored in the corresponding computing node model to obtain a primary fusion feature vector, and the first multilayer perceptron performs feature-level fusion on the primary fusion feature vector to obtain a high-level fusion feature vector;

S43. The first Softmax classifier classifies the high-level fusion feature vector to obtain a preliminary decision corresponding to each feature in the high-level fusion feature vector;

S44. Compute the decision probability of each preliminary decision by normalization, and select the preliminary decision with the highest decision probability as the edge decision output by the corresponding computing node model;

S45. Iterate over the multiple kinds of homologous heterogeneous data, repeating steps S42 to S44, to obtain multiple high-level fusion feature vectors and multiple edge decisions.
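Steps S43 and S44 amount to normalizing classifier scores into probabilities and picking the most probable decision. The sketch below shows that selection in isolation. It assumes the classifier produces one raw score (logit) per candidate decision; the function names and the decision labels used in the usage note are hypothetical and do not come from the patent.

```python
import math


def softmax(logits):
    """Normalize raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def edge_decision(logits, labels):
    """Return the label with the highest softmax probability and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, `edge_decision([0.1, 2.0, -1.0], ["keep", "raise_rate", "lower_rate"])` selects `"raise_rate"`, the label attached to the largest score. The same selection rule applies to the cloud decision in S52.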

Preferably, in S42 the first multilayer perceptron performs feature-level fusion on the primary fusion feature vector to obtain the high-level fusion feature vector, the specific formula being:

where

In the formula, F_i' denotes the high-level fusion feature vector of the i-th kind of homologous heterogeneous data, F_i denotes the primary fusion feature vector of the i-th kind of homologous heterogeneous data, W_l denotes the l-th learnable feature mapping matrix, 1 ≤ l ≤ L, L denotes the total number of feature mapping matrices, f_ij^t denotes the primary fusion feature output after the t-th cycle for the j-th item of the i-th kind of homologous heterogeneous data, and N denotes the total number of items of the i-th kind of homologous heterogeneous data.

Preferably, in S5 the processing by which the cloud computing center in the trained model receives the high-level fusion feature vectors and obtains the cloud decision through the cloud computing center model specifically includes:

S51. The cloud computing center model includes a Transformer network, a second multilayer perceptron and a second Softmax classifier. The Transformer network encodes the multiple high-level fusion feature vectors to the same dimension and fuses their features, yielding a corresponding set of decision-level fusion feature vectors. The second multilayer perceptron fuses the decision-level fusion feature vectors again to obtain a single re-fused feature vector, and the second Softmax classifier classifies the re-fused feature vector to obtain a final decision corresponding to each feature in the re-fused feature vector;

S52. Compute the decision probability of each final decision by normalization, and select the final decision with the highest decision probability as the cloud decision.

Preferably, in S51 the Transformer network encodes the multiple high-level fusion feature vectors to the same dimension and fuses their features to obtain the corresponding decision-level fusion feature vectors, the specific formula being:

In the formula, R_i denotes the decision-level fusion feature vector corresponding to the i-th kind of homologous heterogeneous data; F_oi', F_pi' and F_qi' denote the high-level fusion feature vectors of the i-th kind of homologous heterogeneous data after same-dimension encoding; ReLU denotes the activation function; d denotes the feature dimension; W₁ and W₂ denote learnable feature mapping matrices; and b₁ and b₂ denote learnable bias matrices.
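The formula did not survive in this text. The symbol list (three same-dimension encodings, a feature dimension d, ReLU, two mapping matrices and two biases) matches the usual structure of a Transformer encoder layer, that is, scaled dot-product attention followed by a two-layer feed-forward network. One plausible form under that assumption is sketched below. The assignment of F_oi', F_pi' and F_qi' to the query, key and value roles, and the intermediate symbol A_i, are assumptions; the patent's exact formula may differ.

```latex
A_i = \operatorname{softmax}\!\left( \frac{F_{oi}' \, {F_{pi}'}^{\top}}{\sqrt{d}} \right) F_{qi}',
\qquad
R_i = \operatorname{ReLU}\!\left( A_i W_1 + b_1 \right) W_2 + b_2
```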

Preferably, the edge computing platform model includes multiple computing node models arranged in parallel, the edge computing platform includes multiple computing nodes arranged in parallel, each computing node includes multiple computing devices, and each computing node corresponds to one computing node model. Training the cloud-edge-device data fusion model in S3 with the multiple kinds of homologous heterogeneous data as the training set specifically means training the edge computing platform model within the fusion model in a data-parallel manner, which includes:

S31. Store the multiple kinds of homologous heterogeneous data in the training set in the multiple computing node models of the edge computing platform model, one kind of homologous heterogeneous data per computing node model;

S32. Train the computing node models of the edge computing platform model with the multiple kinds of homologous heterogeneous data in the training set: each computing node model is trained on the homologous heterogeneous data it stores, and when the accuracy of the computing node models no longer improves, parameter updates stop for all computing node models;

S33. For the multiple computing devices in the computing node corresponding to each computing node model, train with the homologous heterogeneous data stored in that computing node model: using distributed data-parallel techniques, distribute the homologous heterogeneous data evenly across the computing devices and load the computing node model of this computing node onto each device; update the computing node model parameters on all devices synchronously and in parallel through parameter sharing, so that the copies of the computing node model on the devices stay consistent; when the accuracy of the computing node model on the devices no longer improves, stop training the current computing node model.
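The synchronous update in S33 can be sketched on a toy problem. The example below assumes a one-variable linear model y = w*x + b trained with mean squared error, stands in for real devices with plain Python lists, and synchronizes by averaging the per-device gradients before every replica applies the same update. The patent only says that parameters are shared and updated synchronously, so gradient averaging is an assumed mechanism, and all function names are hypothetical.

```python
def grad_mse(w, b, shard):
    """Gradient of the mean squared error of y = w*x + b on one data shard."""
    n = len(shard)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def split_evenly(data, num_devices):
    """Give each device an equally sized, contiguous shard (remainder dropped)."""
    size = len(data) // num_devices
    return [data[i * size:(i + 1) * size] for i in range(num_devices)]


def data_parallel_step(params, shards, lr):
    """One synchronous step: every device starts from the same (w, b) replica."""
    w, b = params
    local_grads = [grad_mse(w, b, shard) for shard in shards]  # one per device
    grad_w = sum(g[0] for g in local_grads) / len(local_grads)  # averaged gradient
    grad_b = sum(g[1] for g in local_grads) / len(local_grads)
    return w - lr * grad_w, b - lr * grad_b  # identical update on every replica
```

Because the shards are equally sized, the averaged gradient equals the gradient over the whole data set, so every replica ends each step with the same parameters as single-device training would produce. This is the consistency property that S33 asks for.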

Preferably, training the cloud-edge-device data fusion model in S3 with the multiple kinds of homologous heterogeneous data as the training set specifically means training the fusion model in a model-parallel manner, which includes:

S34. Distributed forward inference of the cloud-edge-device data fusion model: the multiple kinds of homologous heterogeneous data in the training set are used as the input of the computing node models in the edge computing platform model. After computation and inference, when the accuracy of the edge computing platform model no longer changes, the inference result of the edge computing platform model is obtained and used as the input of the cloud computing center model. After computation and inference, when the accuracy of the cloud computing center model no longer changes, the inference result of the cloud computing center model is output. When inference in the cloud computing center model ends, the forward inference of the cloud-edge-device data fusion model ends, and the inference result of the cloud computing center model is the inference result of the cloud-edge-device data fusion model;

S35. Distributed parameter update of the cloud-edge-device data fusion model: on the cloud computing center model, the decision accuracy is computed from the inference result of the cloud computing center model, the change in accuracy is computed, and it is determined whether the accuracy of the cloud computing center model has increased. If the accuracy has increased, the loss of the cloud computing center model is computed, the parameter gradients are computed by back-propagation, and the cloud computing center model parameters are updated. Once all parameters of the cloud computing center model have been updated and its accuracy no longer increases, the parameter gradients are passed to the edge computing platform model, where the parameter gradients are computed and the parameters updated on each computing node model. When the parameters of all computing node models in the edge computing platform model have been updated, the parameter update of the cloud-edge-device data fusion model is complete and training of the model ends.
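The split between S34 and S35 can be sketched with a deliberately tiny two-stage model: an edge stage that produces a feature, and a cloud stage that produces the prediction, computes the loss, and passes a gradient back to the edge. The sketch assumes scalar weights, a squared-error loss and a single training sample, and it omits the accuracy checks that gate the updates in S35. All names are hypothetical and this is not the patent's training procedure.

```python
def edge_forward(w_edge, x):
    """Edge stage: produce the feature that is sent to the cloud."""
    return w_edge * x


def cloud_forward(w_cloud, h):
    """Cloud stage: produce the prediction from the received feature."""
    return w_cloud * h


def cloud_backward(w_cloud, h, y_pred, target):
    """Cloud-side gradients for the loss (y_pred - target)**2."""
    d_pred = 2.0 * (y_pred - target)
    grad_w_cloud = d_pred * h
    grad_h = d_pred * w_cloud  # gradient passed back to the edge stage
    return grad_w_cloud, grad_h


def edge_backward(x, grad_h):
    """Edge-side gradient obtained from the cloud's gradient by the chain rule."""
    return grad_h * x


def train_step(w_edge, w_cloud, x, target, lr):
    """Forward edge to cloud, then backward cloud to edge, then update both."""
    h = edge_forward(w_edge, x)
    y_pred = cloud_forward(w_cloud, h)
    grad_w_cloud, grad_h = cloud_backward(w_cloud, h, y_pred, target)
    grad_w_edge = edge_backward(x, grad_h)
    return w_edge - lr * grad_w_edge, w_cloud - lr * grad_w_cloud
```

The only value that crosses the edge/cloud boundary in the backward direction is `grad_h`, the gradient with respect to the transmitted feature. The edge stage never needs the cloud's weights or the loss itself to update its own parameters.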

A deep-learning-based cloud-edge-device data fusion processing system, which fuses cloud-edge-device data using the deep-learning-based cloud-edge-device data fusion processing method. The system includes multiple terminal devices, an edge computing platform, an edge computing platform model, a cloud computing center, and a cloud computing center model. The edge computing platform model is deployed on the edge computing platform, and the cloud computing center model is deployed on the cloud computing center. The terminal devices are all connected to the edge computing platform, the edge computing platform is connected to the cloud computing center, and the cloud computing center is connected to the terminal devices, wherein:

The terminal devices are used to acquire multi-source heterogeneous data and process them to obtain multiple kinds of homologous heterogeneous data;

The edge computing platform is used to receive the multiple kinds of homologous heterogeneous data, process them with the edge computing platform model to obtain multiple high-level fusion features and the corresponding edge decisions, feed the edge decisions back to the terminal devices to control their collection of homologous heterogeneous data, and feed the edge decisions forward to the cloud computing center;

The cloud computing center fuses the multiple high-level fusion features with the cloud computing center model to obtain decision-level fusion features and a cloud decision. The cloud computing center receives the edge decisions fed forward to it and uses them to control its cloud decision. The cloud computing center feeds the cloud decision back to the terminal devices to control their collection of multi-source heterogeneous data, and feeds the cloud decision back to the edge computing platform to control the edge decisions of the edge computing platform.

By building a cloud-edge-device collaborative system and a deep-learning-based data fusion model, the above method and system can effectively address the communication and computation problems caused by multi-source heterogeneous data and improve the overall decision-making and scheduling capability of the system. They overcome the limitations of traditional centralized cloud computing, establish an efficient distributed data processing system, promote the comprehensive development of the Internet of Things, support the digital transformation of the economy and society, and help open a new era of the Internet of Everything.

Description of drawings

Fig. 1 is a flowchart of a deep-learning-based cloud-edge-device data fusion processing method in an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a deep-learning-based cloud-edge-device data fusion processing system in an embodiment of the present invention;

Fig. 3 is a training flowchart of a deep-learning-based cloud-edge-device data fusion model in an embodiment of the present invention.

Detailed description

To enable those skilled in the art to better understand the technical solution of the present invention, the invention is described in further detail below with reference to the accompanying drawings.


Specifically, refer to Fig. 1 and Fig. 2. Fig. 1 is a flowchart of a deep-learning-based cloud-edge-device data fusion processing method in an embodiment of the present invention, and Fig. 2 is a schematic structural diagram of a deep-learning-based cloud-edge-device data fusion processing system in an embodiment of the present invention.

The deep-learning-based cloud-edge-device data fusion processing method includes:

1) Building the cloud-edge-device data fusion model: the model includes multiple terminal devices, an edge computing platform, an edge computing platform model, a cloud computing center, and a cloud computing center model. The edge computing platform model is deployed on the edge computing platform, and the cloud computing center model is deployed on the cloud computing center. The terminal devices are all connected to both the edge computing platform and the cloud computing center, and the edge computing platform is connected to the cloud computing center;

2) Data acquisition and preprocessing: terminal devices such as sensors of various kinds, mobile phones and personal computers acquire multi-source heterogeneous data such as images, video, sound and text. To keep the multi-source heterogeneous data synchronized, soft synchronization is used to obtain multi-source heterogeneous data with a synchronized frequency. To avoid or reduce the effect that the heterogeneity of each kind of homologous heterogeneous data has on each computing node model in the edge computing platform model, a standardization method (for example Z-Score) is usually applied as preprocessing. This reduces the differences in density, probability distribution and correlation of internal attributes among the homologous heterogeneous data and removes their non-equivalence.

3) Train the cloud-edge-device data fusion model using the multiple kinds of homologous heterogeneous data as the training set, back-propagating and updating parameters until the trained model is obtained. After training ends, input the homologous heterogeneous data collected and processed by the terminal devices, as a data set, into the trained cloud-edge-device data fusion model;

4) The edge computing platform in the trained model receives the multiple kinds of homologous heterogeneous data in the data set and processes them with the edge computing platform model to obtain multiple high-level fusion feature vectors and multiple edge decisions. It transmits the high-level fusion feature vectors to the cloud computing center, feeds the edge decisions back to the terminal devices to control their collection of multi-source heterogeneous data, and feeds the edge decisions forward to the cloud computing center;

5) The cloud computing center in the trained model receives the high-level fusion feature vectors and processes them with the cloud computing center model to obtain a cloud decision. The cloud computing center controls its cloud decision according to the edge decisions fed forward to it. The cloud computing center feeds the cloud decision back to the terminal devices to control their collection of multi-source heterogeneous data, and feeds the cloud decision back to the edge computing platform to control the edge decisions in the edge computing platform.

In one embodiment, acquiring multi-source heterogeneous data with the terminal devices in S2 and preprocessing them to obtain multiple kinds of homologous heterogeneous data specifically includes:

In one embodiment, the multiple kinds of homologous heterogeneous data are obtained in S23 by the following formula:

Specifically, soft synchronization is first used to acquire timestamped multi-source heterogeneous data at the same sampling frequency, and the data are calibrated by timestamp sampling and linear interpolation to obtain frequency-synchronized data, which removes the effect of sampling-time offsets. Then each kind of homologous heterogeneous data in the multi-source heterogeneous data is processed with Z-Score standardization, which reduces the differences in density, probability distribution and correlation of internal attributes among the homologous heterogeneous data and removes their non-equivalence:

In one embodiment, the edge computing platform model includes multiple computing node models arranged in parallel. In S4, after the edge computing platform in the trained cloud-edge-end data fusion model receives the multiple types of homologous heterogeneous data in the data set, it processes them through the edge computing platform model to obtain multiple advanced fusion feature vectors and multiple edge decisions, specifically including:

In one embodiment, in S42 the first multi-layer perceptron performs feature-level fusion on the primary fusion feature vector to obtain the advanced fusion feature vector. The specific formula is:


Specifically, neural networks can be used to extract features from the multiple types of homologous heterogeneous data in the data set. For one-dimensional homologous heterogeneous data, a recurrent neural network (RNN) can be used to extract one-dimensional features, that is, one-dimensional feature vectors. For multi-dimensional homologous heterogeneous data, a convolutional neural network (CNN) can be used to extract multi-dimensional features.

Taking one-dimensional feature extraction from one-dimensional homologous heterogeneous data with an RNN as an example, the feature extraction method is as follows:

where f_ij^t is the feature output after the t-th cycle for the j-th homologous heterogeneous data item of the i-th type (x_ij' serves as the homologous heterogeneous data at t = 0), f_ij^(t-1) is the feature output after the (t-1)-th cycle for the same item, X^t denotes the input feature of the t-th cycle, V is the mapping matrix, ReLU denotes the activation function, and t is the cycle index.
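The recurrence formula itself is not present in this text, so the following is only a sketch under an assumed form, f^t = ReLU(V · [f^(t-1); X^t]), built from the symbols defined above. The matrix sizes, the initial state and the input sequence are hypothetical.

```python
def relu(v):
    """Element-wise ReLU on a vector."""
    return [max(0.0, a) for a in v]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def rnn_features(inputs, V, f0):
    """Apply the assumed recurrence once per cycle and return the final feature."""
    f = list(f0)
    for x_t in inputs:
        # Concatenate the previous feature with the current input, then map and activate.
        f = relu(matvec(V, f + list(x_t)))
    return f

# Hypothetical sizes: 2-dimensional feature state, 1-dimensional input per cycle.
V = [[0.5, 0.0, 1.0],
     [0.0, 0.5, -1.0]]
f0 = [0.0, 0.0]
sequence = [[1.0], [2.0], [-1.0]]

feature = rnn_features(sequence, V, f0)
```

The same mapping matrix V is reused at every cycle, which is what lets one set of parameters handle sequences of any length.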

The global average pooling (GAP) method is used to convert multi-dimensional features into one-dimensional feature vectors, and a cascade (concatenation) approach is used for feature splicing and preliminary fusion. This keeps the features of homologous heterogeneous data fusable and also resolves the differing feature-vector lengths that arise among heterogeneous data because of differing information density:

where F_i denotes the primary fusion feature vector of the i-th type of homologous heterogeneous data and is a 1×N vector; f_ij^t denotes the primary fusion feature output after the t-th cycle for the j-th item of the i-th type, that is, the feature output for x_ij' after the t-th cycle; and N denotes the total number of homologous heterogeneous data items provided by the different terminal devices, with 1 ≤ j ≤ N.
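To illustrate why global average pooling removes the length mismatch, the sketch below pools feature maps of different shapes to one value each and concatenates them into a single vector. The feature maps and function names are hypothetical, and collapsing each item to one scalar is an assumption chosen to match the stated 1×N shape of F_i.

```python
def global_average_pool(feature_map):
    """Collapse a 2-D feature map to the mean of all its cells."""
    cells = [v for row in feature_map for v in row]
    return sum(cells) / len(cells)

def primary_fusion_vector(feature_maps):
    """Pool each item's feature map, then concatenate the results in item order."""
    return [global_average_pool(m) for m in feature_maps]

# Hypothetical feature maps from three data items with different shapes.
maps = [
    [[1.0, 3.0], [5.0, 7.0]],   # 2 x 2
    [[2.0, 4.0, 6.0]],          # 1 x 3
    [[9.0]],                    # 1 x 1
]

F_i = primary_fusion_vector(maps)
```

Whatever the shape of each input map, the result always has one entry per data item, so vectors from different devices can be fused directly.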

A multi-layer perceptron (MLP) performs feature-level fusion on the primary fusion feature vector of each type of homologous heterogeneous data to obtain the advanced fusion feature vector:

where F_i' denotes the advanced fusion feature vector of the i-th type of homologous heterogeneous data and is also a 1×N vector, whose j-th entry is the advanced fusion feature output after the t-th cycle for the j-th item of the i-th type; M_il denotes the l-th learnable feature mapping matrix for the i-th type, l = 1, 2, ..., L; and L is the total number of feature mapping matrices, meaning that an L-layer MLP is used.

The first Softmax classifier classifies the advanced fusion feature vector to obtain a preliminary decision for each feature in the vector.

The decision probability of each preliminary decision is computed by normalization:

where S_ij is the decision probability of the edge decision corresponding to the advanced fusion feature of the j-th item of the i-th type of homologous heterogeneous data, with 0 ≤ S_ij ≤ 1.

The preliminary decision with the largest decision probability is selected as the edge decision of the i-th computing node model, that is, the edge decision for the i-th type of homologous heterogeneous data. The edge decision is fed back to the different terminal devices: human-computer interactive terminals show a pop-up prompt reminding the user to upload the corresponding data, and actively collecting terminals receive instructions that control data collection. The i-th advanced fusion feature vector F_i' and the edge decision of the i-th computing node model are uploaded to the cloud computing center.
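The normalization and selection steps can be sketched as a softmax followed by an arg-max. The normalization formula is not present in this text, so the exponential softmax used here is an assumption consistent with the stated range 0 ≤ S_ij ≤ 1; the decision labels and scores are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that lie in [0, 1] and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def edge_decision(scores, labels):
    """Return the label with the highest decision probability, plus all probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical preliminary decisions for one computing node model.
labels = ["keep_sampling", "request_upload", "pause_collection"]
decision, probs = edge_decision([0.2, 2.0, -1.0], labels)
```

The selected label is what would be fed back to the terminal devices, while the full probability vector remains available for the cloud stage.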

In one embodiment, in S5 the cloud computing center in the trained cloud-edge-end data fusion model receives the multiple advanced fusion feature vectors and processes them through the cloud computing center model to obtain the cloud decision, specifically including:

In one embodiment, the Transformer network in S51 performs same-dimension encoding and feature fusion on the multiple advanced fusion feature vectors to obtain the corresponding decision-level fusion feature vectors. The specific formula is:

Specifically, the advanced fusion feature vectors provided by the computing node models in the edge computing platform model are first one-hot encoded, converting them into sparse data with a single active bit. This addresses the difficulty of classifying decisions across multi-source data and expands the feature set. PCA dimensionality reduction is then applied to lower the data dimension and alleviate the curse of dimensionality; it also reduces data noise and keeps features independent, which helps the subsequent neural network learn and extract features, and it completes the same-dimension encoding of the individual decision probability features. The self-attention mechanism in the Transformer network then captures the correlations among homologous heterogeneous data and performs decision-level fusion, yielding the decision-level fusion feature vector for each type of homologous heterogeneous data:

where R_i denotes the decision-level fusion feature corresponding to the i-th type of homologous heterogeneous data; ReLU denotes the activation function, which introduces nonlinearity and improves the expressive power of the neural network; F_oi', F_pi' and F_qi' denote the same-dimension-encoded advanced fusion feature vectors for the i-th type; d denotes the feature dimension, used to avoid the gradient explosion caused by large values after repeated multiplication; W₁ and W₂ denote learnable feature mapping matrices; and b₁ and b₂ denote learnable offset matrices.
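The fusion formula is not present in this text. The symbols defined above match the usual Transformer pattern of scaled dot-product attention followed by a two-layer feed-forward block, so the sketch below assumes R = ReLU(softmax(F_o F_pᵀ / √d) F_q W₁ + b₁) W₂ + b₂. That form, the use of bias vectors rather than matrices, and all numeric values are assumptions for illustration only.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax_rows(a):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

def add_bias(a, b):
    return [[v + bj for v, bj in zip(row, b)] for row in a]

def relu(a):
    return [[max(0.0, v) for v in row] for row in a]

def decision_level_fusion(F_o, F_p, F_q, W1, b1, W2, b2):
    """Assumed form: scaled dot-product attention, then a two-layer feed-forward block."""
    d = len(F_o[0])
    scores = [[v / math.sqrt(d) for v in row] for row in matmul(F_o, transpose(F_p))]
    attended = matmul(softmax_rows(scores), F_q)
    hidden = relu(add_bias(matmul(attended, W1), b1))
    return add_bias(matmul(hidden, W2), b2)

# Hypothetical encoded vectors for two data types, identity weights, zero offsets.
F = [[1.0, 0.0], [0.0, 1.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
zero = [0.0, 0.0]
R = decision_level_fusion(F, F, F, I, zero, I, zero)
```

Dividing the scores by √d keeps the softmax inputs in a moderate range as the feature dimension grows, which is the role the text assigns to d.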

The second multi-layer perceptron (MLP) re-fuses the multiple decision-level fusion feature vectors into a single re-fused feature vector, reducing the dimension of the fused features. The second Softmax classifier then classifies the re-fused feature vector to obtain a final decision for each feature in it. Normalization converts the final decisions into a probability distribution whose values lie between 0 and 1 and sum to 1; the calculation is the same as that used for the decision probabilities of the preliminary decisions and is not repeated here. The final decision with the largest decision probability is then selected as the cloud decision. The cloud decision is fed back to the multiple terminal devices for overall control of the large-scale terminal fleet, governing their collection of multi-source heterogeneous data, and is fed back to the edge computing platform to control the upload of homologous heterogeneous data.

In one embodiment, the edge computing platform model includes multiple computing node models arranged in parallel, the edge computing platform includes multiple computing nodes arranged in parallel, each computing node includes multiple computing devices, and each computing node corresponds to one computing node model. In S3, the cloud-edge-end data fusion model is trained using multiple types of homologous heterogeneous data as the training set; specifically, the edge computing platform model within it is trained in a data-parallel manner, which includes:

S33. The multiple computing devices in each computing node are trained with the homologous heterogeneous data stored in the corresponding computing node model. Using distributed data parallelism, the homologous heterogeneous data is distributed evenly across the computing devices, and each device loads the computing node model of its node. The model parameters on all devices are updated synchronously and in parallel through parameter sharing, which keeps the computing node model consistent across devices. Training of the current computing node model stops when the accuracy of the model on the devices no longer improves.

Specifically, the multi-source heterogeneous data acquired by the different terminal devices is decoupled into multiple types of homologous heterogeneous data and stored in distributed fashion across the computing node models of the edge computing platform model, with each computing node model storing one type; this reduces data access and transmission. The edge computing platform model is trained in parallel across the types of homologous heterogeneous data: each computing node model trains on the data stored in it, with no communication or data transfer between nodes, and parameter updates for all node models stop when the accuracy of each node model no longer improves. Within a single computing node model, the corresponding homologous heterogeneous data is used for parallel training: distributed data parallel (DDP) technology distributes the training data synchronously to the computing devices in the node, the node model's parameters are updated synchronously and in parallel through parameter sharing, and training stops when the accuracy of the computing node model no longer improves.

The DDP technique works as follows:

1) Multiple processes are started to manage the multiple computing devices in the multiple computing nodes. Each process loads the computing node model of the corresponding node onto its computing device, and training data and training tasks are allocated according to the performance of each computing device in the node.

2) After each computing node model completes one forward inference pass, its computation result is passed to every computing device in the node, where the result is analyzed and the training loss is computed. The training loss is then passed to the computing node model, and gradient computation and model parameter updates are carried out on each computing device in the node. Once the parameters on all computing devices have been updated, the current computing node model has completed one model parameter update.

3) Because the model structure and initial parameters are identical on the computing devices in each computing node, and the same training loss and training strategy are used for gradient computation and parameter updates, the model parameters on the computing devices of each computing node model are exactly the same after training.
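The consistency argument in steps 1) to 3) can be checked with a toy single-process simulation. This is not a real distributed framework: the one-parameter linear model, the even data split, the learning rate and the gradient averaging that stands in for the synchronization step are all assumptions chosen for illustration.

```python
def local_gradient(w, shard):
    """Mean-squared-error gradient of the model y = w * x on one device's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_train(data, num_devices, lr=0.01, steps=200):
    """Simulate replicas that start identical and apply the same averaged gradient."""
    shards = [data[k::num_devices] for k in range(num_devices)]  # even split
    weights = [0.0] * num_devices  # identical initial parameters on every device
    for _ in range(steps):
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        avg = sum(grads) / num_devices  # stands in for the synchronization step
        weights = [w - lr * avg for w in weights]
    return weights

# Hypothetical training set following y = 3x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
weights = data_parallel_train(data, num_devices=4)
```

Because every replica starts from the same value and applies the same update at each step, the replicas never diverge, which mirrors the reasoning given in step 3).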

In one embodiment, in S3 the cloud-edge-end data fusion model is trained using multiple types of homologous heterogeneous data as the training set; specifically, the cloud-edge-end data fusion model is trained in a model-parallel manner, which includes:

Specifically, refer to FIG. 3, which is a training flowchart of the deep-learning-based cloud-edge-end data fusion model in an embodiment of the present invention.

Through model-parallel training, multiple types of homologous heterogeneous data are used to fine-tune the parameters of the edge computing platform model and the cloud computing center model, completing the training of the cloud-edge-end data fusion model:

1) Distributed forward inference of the cloud-edge-end data fusion model: the edge computing platform model takes the homologous heterogeneous data as the input of each computing node model and computes the model result, the decision accuracy and the change in accuracy. If the change in accuracy is greater than zero, the model loss and parameter gradients are computed, the model parameters are updated, and the homologous heterogeneous data is again fed to the computing node model and the same computation (model result, decision accuracy, change in accuracy, and the check on that change) is repeated. This continues until the accuracy of the computing node model no longer changes, at which point the inference for that computing node model ends. Once inference has finished for all computing node models, the results of the edge computing platform model are transmitted to the cloud computing center model as its input, and inference is performed there in the same way as for the computing node models. When inference in the cloud computing center model ends, the forward inference of the cloud-edge-end data fusion model is complete. The forward inference is expressed as:

y = (U₁(x₁₁') + U₁(x₁₂') + ··· + U₁(x_1j')) + ··· + (U_i(x_i1') + U_i(x_i2') + ··· + U_i(x_ij'))

where y denotes the final result of the cloud-edge-end data fusion model, Y denotes the cloud computing center model, U_i denotes the i-th computing node model in the edge computing platform model, u_i denotes the result computed by the i-th computing node model, that is, the result for the i-th type of homologous heterogeneous data, and x_ij' denotes the input of the i-th computing node model, that is, the j-th item of the i-th type of homologous heterogeneous data. Note that if N types of multi-source heterogeneous data are acquired through N different terminal devices, processing yields at most N types of homologous heterogeneous data. The number of computing nodes in the edge computing platform model only needs to satisfy "each computing node processes one type of homologous heterogeneous data" and is not necessarily N.

2) Distributed parameter update of the cloud-edge-end data fusion model: on the cloud computing center model, the inference result of the cloud-edge-end data fusion model is used to compute the decision accuracy and the change in accuracy, and to determine whether the accuracy of the cloud computing center model has increased. If it has, the loss of the cloud computing center model is computed, the parameter gradients are computed by backpropagation, and the parameters of the cloud computing center model are updated. When all parameters of the cloud computing center model have been updated and its accuracy no longer increases, the parameter gradients are passed to the edge computing platform model, and each computing node model computes its parameter gradients and updates its model parameters. When the parameters of all computing node models in the edge computing platform model have been updated, the parameter update of the cloud-edge-end data fusion model is complete and training of the model ends. The parameter-gradient computation is expressed as:

where Y' denotes the parameter gradient of the cloud computing center model, y denotes the final result of the cloud-edge-end data fusion model, U_i' denotes the parameter gradient of the i-th computing node model in the edge computing platform model, u_i denotes the result computed by the i-th computing node model, and x_ij' denotes the input of the i-th computing node model, that is, the j-th item of the i-th type of homologous heterogeneous data.
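The accuracy-driven stopping rule used in steps 1) and 2) above can be sketched as a small loop that keeps updating while the accuracy change is positive. The helper name, the toy model and its scripted accuracy curve are hypothetical stand-ins, not the patent's models.

```python
def train_until_plateau(update_step, evaluate_accuracy, max_rounds=1000):
    """Repeat parameter updates while accuracy keeps increasing.

    Returns the best accuracy seen and the number of update rounds executed.
    """
    best = evaluate_accuracy()
    rounds = 0
    while rounds < max_rounds:
        update_step()
        rounds += 1
        accuracy = evaluate_accuracy()
        if accuracy - best <= 0:  # accuracy change is not positive: stop
            break
        best = accuracy
    return best, rounds

class ToyModel:
    """Hypothetical stand-in: one parameter, accuracy peaks when w equals target."""
    def __init__(self, target):
        self.w = 0.0
        self.target = target

    def update(self):
        self.w += 0.25

    def accuracy(self):
        return 1.0 - abs(self.w - self.target)

node = ToyModel(target=1.0)
best, rounds = train_until_plateau(node.update, node.accuracy)
```

In the two-stage scheme described above, a loop of this kind would run for each computing node model first and then for the cloud computing center model, with the node outputs serving as the cloud model's input.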

In one embodiment, the cloud computing center includes multiple computing devices operating in parallel, and the cloud computing center model is trained with the inference results of the edge computing platform model, specifically including:

1) The inference results of the computing node models in the edge computing platform model are used to construct the training data for the cloud computing center model;

2) Using DDP, the training data is distributed synchronously to the computing devices in the cloud computing center, the model parameters are updated synchronously and in parallel through parameter sharing, and training stops when the accuracy of the cloud computing center model no longer improves.

In one embodiment, a deep-learning-based cloud-edge-end data fusion processing system performs fusion processing on cloud-edge-end data using the deep-learning-based cloud-edge-end data fusion processing method. The system includes multiple terminal devices, an edge computing platform, an edge computing platform model, a cloud computing center and a cloud computing center model. The edge computing platform model is deployed on the edge computing platform and the cloud computing center model is deployed on the cloud computing center. The multiple terminal devices are all connected to the edge computing platform, the edge computing platform is connected to the cloud computing center, and the cloud computing center is connected to the multiple terminal devices, wherein:

For the specific limitations of the deep-learning-based cloud-edge-end data fusion processing system, refer to the limitations of the deep-learning-based cloud-edge-end data fusion processing method above; they are not repeated here.

The deep-learning-based cloud-edge-end data fusion processing method and system provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the above embodiments is intended only to aid understanding of its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the invention without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims.

Claims

1. The cloud edge data fusion processing method based on deep learning is characterized by comprising the following steps of:

S1, a cloud edge data fusion model is built, the cloud edge data fusion model comprises a plurality of terminal devices, an edge computing platform model, a cloud computing center and a cloud computing center model, the edge computing platform model is arranged on the edge computing platform, the cloud computing center model is arranged on the cloud computing center, a plurality of terminal devices are connected with the edge computing platform, the edge computing platform is connected with the cloud computing center, and the cloud computing center is connected with a plurality of terminal devices;

S2, preprocessing the multi-source heterogeneous data, after it has been acquired by a plurality of terminal devices, to obtain a plurality of kinds of homologous heterogeneous data;

S3, training the cloud edge data fusion model using the plurality of kinds of homologous heterogeneous data as a training set, updating parameters by back-propagation until a trained cloud edge data fusion model is obtained, and, after training is finished, inputting the plurality of kinds of homologous heterogeneous data acquired and processed by the plurality of terminal devices into the trained cloud edge data fusion model as a data set;

S4, after the edge computing platform in the trained cloud edge data fusion model receives the plurality of kinds of homologous heterogeneous data in the data set, processing the data set through the edge computing platform model to obtain a plurality of advanced fusion feature vectors and a plurality of edge decisions, transmitting the advanced fusion feature vectors to the cloud computing center, feeding the edge decisions back to the terminal devices to control their acquisition of the multi-source heterogeneous data, and feeding the edge decisions forward to the cloud computing center;

S5, the cloud computing center in the trained cloud edge data fusion model receiving the plurality of advanced fusion feature vectors and processing them through the cloud computing center model to obtain a cloud decision; the cloud computing center controlling its cloud decision according to the fed-forward edge decisions; and the cloud computing center feeding the cloud decision back to the plurality of terminal devices to control their acquisition of multi-source heterogeneous data, and feeding the cloud decision back to the edge computing platform to control the edge decisions of the edge computing platform.

2. The cloud end data fusion processing method based on deep learning as claimed in claim 1, wherein after the multi-source heterogeneous data is obtained by using a plurality of terminal devices in the step S2, the multi-source heterogeneous data is preprocessed to obtain a plurality of kinds of homologous heterogeneous data, and the specific process includes:

S21, acquiring multi-source heterogeneous data by using a plurality of terminal devices;

S22, calibrating the multi-source heterogeneous data into frequency-synchronized multi-source heterogeneous data by timestamp sampling and linear interpolation;

S23, standardizing the frequency-synchronized multi-source heterogeneous data to obtain a plurality of kinds of homologous heterogeneous data.

3. The cloud edge data fusion processing method based on deep learning as claimed in claim 2, wherein the plurality of kinds of homologous heterogeneous data are obtained in step S23 by a specific formula:

wherein x_ij' represents the j-th homologous heterogeneous data item of the i-th kind after standardization, x_ij represents the same item before standardization, j = 1, 2, ..., and μ_i and δ_i represent the mean and variance of the population of the i-th kind of homologous heterogeneous data.

4. The method for processing cloud edge data fusion based on deep learning according to claim 3, wherein the edge computing platform model comprises a plurality of computing node models arranged in parallel, and the edge computing platform in the trained cloud edge data fusion model in S4 receives the plurality of homologous heterogeneous data in the data set and then processes the plurality of homologous heterogeneous data through the edge computing platform model to obtain a plurality of high-level fusion feature vectors and a plurality of edge decisions, and the method specifically comprises:

S41, storing multiple homologous heterogeneous data in the data set in multiple computing node models, wherein one computing node model stores one type of homologous heterogeneous data;

S42, the computing node model comprises a neural network, a first multi-layer perceptron and a first Softmax classifier, wherein the neural network is used for extracting features of the homologous heterogeneous data stored in the corresponding computing node model to obtain a primary fusion feature vector, and the first multi-layer perceptron performs feature-level fusion on the primary fusion feature vector to obtain an advanced fusion feature vector;

S43, the first Softmax classifier is used for classifying the advanced fusion feature vector to obtain a preliminary decision corresponding to each feature in the advanced fusion feature vector;

S44, calculating the decision probability of each preliminary decision according to a normalization method, and selecting the preliminary decision with the maximum decision probability value as the edge decision output by the corresponding computing node model;

S45, traversing the plurality of kinds of homologous heterogeneous data, and repeating steps S42 to S44 to obtain a plurality of advanced fusion feature vectors and a plurality of edge decisions.

5. The method for processing cloud end data fusion based on deep learning as claimed in claim 4, wherein in the step S42, the first multi-layer perceptron performs feature level fusion on the primary fusion feature vector to obtain an advanced fusion feature vector, and the specific formula is as follows:

wherein F_i' represents the advanced fusion feature vector of the i-th kind of homologous heterogeneous data, F_i represents the primary fusion feature vector of the i-th kind of homologous heterogeneous data, W_l represents the l-th learnable feature mapping matrix, 1 ≤ l ≤ L, L represents the total number of feature mapping matrices, f_ij^t represents the primary fusion feature output after the t-th cycle for the j-th homologous heterogeneous data item of the i-th kind, and N represents the total number of homologous heterogeneous data items of the i-th kind.

6. The deep learning-based cloud edge data fusion processing method of claim 5, wherein a cloud computing center in the trained cloud edge data fusion model in S5 receives a plurality of the advanced fusion feature vectors, and processes the advanced fusion feature vectors through a cloud computing center model to obtain cloud decisions, and the method specifically comprises the following steps:

S51, the cloud computing center model comprises a Transformer network, a second multi-layer perceptron and a second Softmax classifier, wherein the Transformer network is used for performing same-dimension encoding and feature fusion on the plurality of advanced fusion feature vectors to obtain a corresponding plurality of decision-level fusion feature vectors, the second multi-layer perceptron is used for re-fusing the plurality of decision-level fusion feature vectors to obtain a re-fused feature vector, and the second Softmax classifier is used for classifying the re-fused feature vector to obtain a final decision corresponding to each feature in the re-fused feature vector;

S52, calculating the decision probability of each final decision by a normalization method, and selecting the final decision with the largest decision probability as the cloud decision.
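By way of illustration only, step S52 may be sketched as a softmax normalization followed by selection of the most probable decision. Softmax is assumed here as the normalization method because the classifier in S51 is a Softmax classifier; the function names, scores and labels are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cloud_decision(scores: List[float], labels: List[str]) -> Tuple[str, float]:
    """Return the final decision with the largest decision probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, `cloud_decision([1.0, 3.0, 2.0], ["hold", "dispatch", "alert"])` selects the second label, since it has the largest score and therefore the largest probability.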

7. The deep-learning-based cloud-edge data fusion processing method according to claim 6, wherein in step S51 the transformer network performs same-dimension encoding and feature fusion on the plurality of high-level fusion feature vectors to obtain a corresponding plurality of decision-level fusion feature vectors, the specific formula being as follows:

in the formula, R_i represents the decision-level fusion feature vector corresponding to the i-th type of homologous heterogeneous data; F_oi', F_pi' and F_qi' represent the same-dimension-encoded high-level fusion feature vectors corresponding to the i-th type of homologous heterogeneous data; ReLU represents the activation function; d represents the feature dimension; W_1 and W_2 represent learnable feature mapping matrices; and b_1 and b_2 represent learnable bias matrices.
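By way of illustration only, the symbols defined in claim 7 (three same-dimension-encoded vectors, a feature dimension d, ReLU, and two mapping matrices with biases) are consistent with a standard transformer layer: scaled dot-product attention followed by a two-layer feed-forward network. The formula itself is not reproduced in this text, so this correspondence is an assumption. In particular, which of F_oi', F_pi', F_qi' plays the query, key or value role is not stated, and the biases are treated here as vectors broadcast over rows. All function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m: Matrix) -> Matrix:
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])  # feature dimension
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)


def feed_forward(x: Matrix, w1: Matrix, b1: List[float],
                 w2: Matrix, b2: List[float]) -> Matrix:
    """Compute ReLU(x W_1 + b_1) W_2 + b_2 row by row."""
    hidden = [[max(0.0, h + b) for h, b in zip(row, b1)] for row in matmul(x, w1)]
    return [[o + b for o, b in zip(row, b2)] for row in matmul(hidden, w2)]
```

Under these assumptions, each row of `feed_forward(attention(q, k, v), w1, b1, w2, b2)` would correspond to one decision-level fusion feature vector R_i.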

8. The deep-learning-based cloud-edge data fusion processing method according to claim 1, wherein the edge computing platform model comprises a plurality of computing node models arranged in parallel, the edge computing platform comprises a plurality of computing nodes arranged in parallel, each computing node comprises a plurality of computing devices, and each computing node corresponds to one computing node model; and wherein, in training the cloud-edge data fusion model in S3 with the various types of homologous heterogeneous data as the training set, the edge computing platform model in the cloud-edge data fusion model is trained in a data-parallel manner, which specifically comprises the following steps:

S31, storing the various types of homologous heterogeneous data in the training set in the plurality of computing node models in the edge computing platform model, wherein each computing node model stores one type of homologous heterogeneous data;

S32, training each computing node model in the edge computing platform model with the homologous heterogeneous data in the training set: each computing node model is trained on the homologous heterogeneous data it stores, and parameter updating of all computing node models stops when the accuracy of each computing node model no longer improves;

S33, training on the plurality of computing devices in the computing node corresponding to each computing node model, using the homologous heterogeneous data stored in that computing node model: the homologous heterogeneous data is distributed evenly across the computing devices by a distributed data-parallel technique, the computing node model corresponding to the computing node is loaded on each computing device, the computing node model parameters on all computing devices are updated synchronously and in parallel by parameter sharing so that the computing node models on all computing devices remain consistent, and training of the current computing node model stops when the accuracy of the computing node model on each computing device no longer improves.
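By way of illustration only, the data-parallel training of step S33 may be sketched with a toy one-parameter model y = w * x. The sketch assumes that the samples are split evenly across devices, that each device computes a gradient on its own shard, and that the averaged gradient (standing in for an all-reduce) is applied identically to every replica so the copies stay consistent. Training loss is used as a stand-in for the accuracy criterion of the claim, and all names and hyperparameters are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def shard(data: List[Sample], n_devices: int) -> List[List[Sample]]:
    """Distribute samples round-robin so each device gets an even share."""
    return [data[i::n_devices] for i in range(n_devices)]


def local_gradient(w: float, batch: List[Sample]) -> float:
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_node(data: List[Sample], n_devices: int = 2, lr: float = 0.05,
               patience: int = 3, max_steps: int = 500) -> List[float]:
    shards = shard(data, n_devices)
    replicas = [0.0] * n_devices  # one copy of the node model per device
    best_loss, stale = float("inf"), 0
    for _ in range(max_steps):
        grads = [local_gradient(w, s) for w, s in zip(replicas, shards)]
        avg = sum(grads) / n_devices  # stands in for an all-reduce
        replicas = [w - lr * avg for w in replicas]  # same update on every device
        loss = sum((replicas[0] * x - y) ** 2 for x, y in data) / len(data)
        if loss < best_loss - 1e-12:
            best_loss, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:  # stop once the model no longer improves
                break
    return replicas
```

Because every replica starts from the same value and receives the same averaged gradient, the copies remain identical after each step, which is the consistency property the claim requires.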

9. The deep-learning-based cloud-edge data fusion processing method according to claim 8, wherein training the cloud-edge data fusion model in step S3 with the various types of homologous heterogeneous data as the training set specifically comprises the following steps:

S34, distributed forward inference of the cloud-edge data fusion model: the various types of homologous heterogeneous data in the training set are taken as the inputs of the respective computing node models in the edge computing platform model; through computational inference, the inference result of the edge computing platform model is obtained when the accuracy of the edge computing platform model no longer changes; the inference result of the edge computing platform model is taken as the input of the cloud computing center model; through computational inference, the inference result of the cloud computing center model is output when the accuracy of the cloud computing center model no longer changes; and when the inference of the cloud computing center model ends, the forward inference process of the cloud-edge data fusion model ends, the inference result of the cloud computing center model being the inference result of the cloud-edge data fusion model;

S35, distributed parameter updating of the cloud-edge data fusion model: on the cloud computing center model, the decision accuracy is calculated from the inference result of the cloud computing center model, the change in accuracy is calculated, and it is determined whether the accuracy of the cloud computing center model has increased; if so, the loss of the cloud computing center model is calculated, the parameter gradients are computed by backpropagation, and the parameters of the cloud computing center model are updated; when all parameters of the cloud computing center model have been updated and the accuracy of the cloud computing center model no longer increases, the parameter gradients are transmitted to the edge computing platform model, and on each computing node model in the edge computing platform model the parameter gradients are calculated and the parameters of that computing node model are updated; and when the parameter updating of all computing node models in the edge computing platform model has ended, the parameter updating of the cloud-edge data fusion model ends and the training of the cloud-edge data fusion model ends.
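By way of illustration only, the flow of steps S34 and S35 may be sketched with two toy scalar models: an edge model h = a * x whose output feeds a cloud model y = c * h. The forward pass runs edge first and cloud second; in the backward pass the cloud computes its own parameter gradient and sends the gradient with respect to its input back to the edge. As a simplification, the sketch updates both sides once per epoch and omits the accuracy-gated staging described in S35. The models, names and hyperparameters are hypothetical.

```python
from typing import List, Tuple


def forward(a: float, c: float, x: float) -> Tuple[float, float]:
    """S34: the edge output h is the input of the cloud model."""
    h = a * x         # toy edge computing platform model
    return h, c * h   # toy cloud computing center model


def train(data: List[Tuple[float, float]], a: float = 0.5, c: float = 0.5,
          lr: float = 0.01, epochs: int = 200) -> Tuple[float, float]:
    for _ in range(epochs):
        grad_c = grad_a = 0.0
        for x, y in data:
            h, pred = forward(a, c, x)
            err = 2.0 * (pred - y)   # derivative of the squared error w.r.t. pred
            grad_c += err * h        # cloud-side parameter gradient
            grad_h = err * c         # gradient transmitted from cloud to edge
            grad_a += grad_h * x     # edge-side parameter gradient
        c -= lr * grad_c / len(data)  # S35: cloud parameters are updated first
        a -= lr * grad_a / len(data)  # then the edge parameters
    return a, c
```

With targets generated as y = 3 * x, the product a * c converges toward 3, showing that the gradient passed from the cloud side is sufficient to train the edge side.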

10. A deep-learning-based cloud-edge data fusion processing system, characterized by comprising a plurality of terminal devices, an edge computing platform, an edge computing platform model, a cloud computing center and a cloud computing center model, wherein the edge computing platform model is deployed on the edge computing platform, the cloud computing center model is deployed on the cloud computing center, the plurality of terminal devices are connected to the edge computing platform, the edge computing platform is connected to the cloud computing center, and the cloud computing center is connected to the plurality of terminal devices, wherein:

the plurality of terminal devices are used for acquiring multi-source heterogeneous data and processing the multi-source heterogeneous data to obtain various types of homologous heterogeneous data;

the edge computing platform is used for receiving the various types of homologous heterogeneous data, processing them through the edge computing platform model to obtain a plurality of high-level fusion features and a plurality of corresponding edge decisions, feeding the edge decisions back to the terminal devices to control their acquisition of the homologous heterogeneous data, and feeding the edge decisions forward to the cloud computing center;

the cloud computing center performs feature fusion on the plurality of high-level fusion features through the cloud computing center model to obtain decision-level fusion features and a cloud decision; the cloud computing center receives the plurality of fed-forward edge decisions, which control the cloud decision of the cloud computing center; and the cloud computing center feeds the cloud decision back to the plurality of terminal devices to control their acquisition of the multi-source heterogeneous data, and feeds the cloud decision back to the edge computing platform to control the edge decisions of the edge computing platform.