CN117556187B - Cloud data restoration method and system based on deep learning and readable storage medium - Google Patents

Info

Publication number
CN117556187B
Authority
CN
China
Prior art keywords
cloud data
repair
deep learning
repaired
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311496926.XA
Other languages
Chinese (zh)
Other versions
CN117556187A (en)
Inventor
康波峰
周烈华
彭忠
周继中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weichuang Software Wuhan Co ltd
Original Assignee
Weichuang Software Wuhan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weichuang Software Wuhan Co ltd
Priority to CN202311496926.XA
Publication of CN117556187A
Application granted
Publication of CN117556187B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Abstract

The invention discloses a cloud data restoration method, a cloud data restoration system and a readable storage medium based on deep learning, wherein the cloud data restoration method comprises the following steps: when the repair trigger instruction is monitored, acquiring cloud data to be repaired corresponding to the repair trigger instruction, and determining a repair type corresponding to the cloud data to be repaired; if the cloud data to be repaired is of the time sequence type, repairing the cloud data to be repaired based on the generated repair factor matrix corresponding to the cloud data to be repaired and a first type deep learning model corresponding to the time sequence type; and if the cloud data is of the non-time sequence type, repairing the cloud data based on the searched repair reference text corresponding to the cloud data to be repaired and a second class deep learning model corresponding to the non-time sequence type. According to the method, different repair mechanisms are respectively set for the cloud data to be repaired of the time sequence type and the non-time sequence type, and the cloud data to be repaired are repaired by combining the information embodying the characteristics of the cloud data through the respective deep learning models, so that accurate repair of the time sequence data and the non-time sequence data in the cloud data is realized.

Description

Cloud data restoration method and system based on deep learning and readable storage medium
Technical Field
The invention relates to the technical field of data processing, in particular to a cloud data restoration method and system based on deep learning and a readable storage medium.
Background
With the development of technology, networks now cover many industries, such as the Internet of Things, the Internet of Vehicles, smart homes and the industrial Internet. Over time, such networks generate large amounts of communication and monitoring data, which are commonly referred to as time series data because of their time characteristic. Monitoring data can be used to predict the trend or performance of subsequent data: for example, electric meter data from the Internet of Things can be used to predict electricity consumption and meter performance over a subsequent period, and Internet of Vehicles data can be used to predict the congestion of each lane, the energy-saving performance of vehicles, and the like. In some cases the monitored data may be abnormal, for example with values much greater or smaller than those normally monitored, or with data missing, which affects the prediction function.
In addition, to facilitate the storage and management of massive time series data, a data owner typically transmits the generated time series data to its own or a third-party network cloud disk to form cloud data. Besides the time series data, the cloud data inevitably also contains some non-time-series data related to the time series data; abnormalities in such non-time-series data can affect the accurate storage of the time series data and thus cause further abnormalities in the time series data. Therefore, in order to achieve accurate storage of time series data and accurate prediction using time series data, it is necessary to repair the abnormal data in the cloud data.
Disclosure of Invention
The invention mainly aims to provide a cloud data restoration method and system based on deep learning and a readable storage medium, and aims to solve the technical problem of how to restore abnormal data in cloud data.
In order to achieve the above object, the present invention provides a cloud data restoration method based on deep learning, the cloud data restoration method based on deep learning comprising:
when a repair trigger instruction is monitored, acquiring cloud data to be repaired corresponding to the repair trigger instruction, and determining a repair type corresponding to the cloud data to be repaired;
If the repair type is a time sequence type, generating a repair factor matrix corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair factor matrix and a first type deep learning model corresponding to the time sequence type;
If the repair type is a non-time sequence type, searching a repair reference text corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair reference text and a second-class deep learning model corresponding to the non-time sequence type.
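The three steps above amount to a dispatch on the repair type. The following is a minimal sketch of that dispatch only; the `data_id` key, the `RepairType` enum and the placeholder handlers are illustrative assumptions and stand in for the two deep learning models, which the source does not specify at this level.

```python
from enum import Enum
from typing import Callable, Dict


class RepairType(Enum):
    TIME_SERIES = "time_series"
    NON_TIME_SERIES = "non_time_series"


def determine_repair_type(record: dict) -> RepairType:
    """Map a record's data identifier to a repair type.

    The "data_id" key and its values are hypothetical names for illustration.
    """
    if record.get("data_id") == "time_series":
        return RepairType.TIME_SERIES
    return RepairType.NON_TIME_SERIES


def repair(record: dict, handlers: Dict[RepairType, Callable[[dict], dict]]) -> dict:
    """Route the record to the repair routine registered for its type."""
    return handlers[determine_repair_type(record)](record)


# Placeholder handlers; a real system would call the trained models here.
handlers = {
    RepairType.TIME_SERIES: lambda r: {**r, "repaired_by": "first_type_model"},
    RepairType.NON_TIME_SERIES: lambda r: {**r, "repaired_by": "second_type_model"},
}

result = repair({"data_id": "time_series", "value": None}, handlers)
```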
Optionally, the step of generating a repair factor matrix corresponding to the cloud data to be repaired includes:
Determining influence factors of the cloud data to be repaired, and determining reference equipment information corresponding to the cloud data to be repaired according to the influence factors, wherein the influence factors at least comprise the model number, service life, maintenance record and environmental factors of detection equipment corresponding to the cloud data to be repaired;
And acquiring adjacent data corresponding to the cloud data to be repaired and reference adjacent data corresponding to each piece of reference equipment information respectively, and forming the adjacent data and each piece of reference adjacent data into the repair factor matrix.
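As a rough illustration of how the adjacent data and the reference adjacent data could be assembled, the sketch below stacks them row by row. The row layout (target device first, one row per reference device, equal window length) and the sample readings are assumptions; the source does not fix the matrix layout.

```python
from typing import List, Sequence


def build_repair_factor_matrix(
    adjacent: Sequence[float],
    reference_adjacent: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Stack the target's adjacent readings above each reference device's readings.

    Assumes every row covers the same time window, so all rows share one length.
    """
    width = len(adjacent)
    for row in reference_adjacent:
        if len(row) != width:
            raise ValueError("all rows must cover the same time window")
    return [list(adjacent)] + [list(row) for row in reference_adjacent]


# Hypothetical readings taken around the abnormal point.
adjacent = [20.1, 20.3, 20.2, 20.4]
reference_adjacent = [[19.9, 20.0, 20.1, 20.2], [20.5, 20.6, 20.4, 20.7]]
matrix = build_repair_factor_matrix(adjacent, reference_adjacent)
```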
Optionally, the step of repairing the cloud data to be repaired based on the repair factor matrix and the first class deep learning model corresponding to the time sequence type includes:
determining a plurality of similarity matrixes corresponding to the repair factor matrixes from training sample matrixes of the first-type deep learning model;
Transmitting the repair factor matrix and each similarity matrix to the first type of deep learning model, and calculating based on the weight matrix and the activation function of the first type of deep learning model to obtain a first calculation result, wherein the activation function is as follows:
$F(x) = \max\left( W \cdot T_n / W_i + \left( C_0^{n} \left( (n-1)^2 / n \right) \cdot T_n \right) \right)$;
where $F(x)$ is the first calculation result, $\max$ denotes taking the maximum value, $W$ is the repair factor matrix, $T_n$ is the weight matrix, $W_i$ is the i-th similarity matrix, $C_j$ is the layer parameter of the j-th layer in the first-type deep learning model, and $n$ is the number of layers of the first-type deep learning model;
and repairing the cloud data to be repaired based on the first calculation result.
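The first of these steps selects similarity matrices from the training sample matrices, but the source does not say how similarity is measured. The sketch below assumes the Frobenius distance and a top-k selection purely for illustration; `frobenius_distance` and `most_similar` are hypothetical helper names.

```python
import math
from typing import List

Matrix = List[List[float]]


def frobenius_distance(a: Matrix, b: Matrix) -> float:
    """Element-wise Euclidean distance between two equally shaped matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def most_similar(target: Matrix, samples: List[Matrix], k: int) -> List[Matrix]:
    """Return the k training sample matrices closest to the target (assumed metric)."""
    return sorted(samples, key=lambda s: frobenius_distance(target, s))[:k]


target = [[1.0, 2.0], [3.0, 4.0]]
samples = [
    [[1.0, 2.0], [3.0, 5.0]],  # differs from the target in one entry
    [[9.0, 9.0], [9.0, 9.0]],  # far from the target
    [[1.0, 2.0], [3.0, 4.0]],  # identical to the target
]
top2 = most_similar(target, samples, k=2)
```

Any other matrix similarity measure could be substituted in `most_similar` without changing the surrounding flow.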
Optionally, the step of searching the repair reference text corresponding to the cloud data to be repaired includes:
Determining a time coefficient corresponding to the cloud data to be repaired, and searching a reference file corresponding to the time coefficient;
and screening the repair reference text from each reference file based on the data type of the cloud data to be repaired.
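The two-stage lookup above (first by time coefficient, then by data type) can be sketched as a pair of filters. The `ReferenceFile` fields and the sample values are assumptions made for illustration; the source does not define how reference files are indexed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReferenceFile:
    time_coefficient: str  # hypothetical: period the file is associated with
    data_type: str         # hypothetical: e.g. "sensor_model"
    text: str


def find_repair_reference_texts(
    files: List[ReferenceFile], time_coefficient: str, data_type: str
) -> List[str]:
    """Filter files by time coefficient, then screen them by data type."""
    candidates = [f for f in files if f.time_coefficient == time_coefficient]
    return [f.text for f in candidates if f.data_type == data_type]


files = [
    ReferenceFile("2023-Q1", "sensor_model", "Installation note for sensor A"),
    ReferenceFile("2023-Q1", "maintenance", "Maintenance log for sensor A"),
    ReferenceFile("2022-Q4", "sensor_model", "Older installation note"),
]
texts = find_repair_reference_texts(files, "2023-Q1", "sensor_model")
```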
Optionally, the step of repairing the cloud data to be repaired based on the repair reference text and the second class deep learning model corresponding to the non-time sequence type includes:
Word segmentation processing is carried out on the repair reference text based on the second class deep learning model, a plurality of text words are obtained, pixel feature recognition is carried out on each text word, and pixel feature coordinates of each text word are determined;
obtaining reference feature coordinates of reference information corresponding to the cloud data to be repaired, and performing similarity calculation on each pixel feature coordinate and the reference feature coordinates based on the second-type deep learning model to obtain a second calculation result, wherein the similarity calculation formula is as follows:
$W(x,y) = g(x) \,\|\, f_{mb}^{a} \,\|\, p_a\left( W_1(x_{1m}, y_{1m}) - W_2(x_{2m}, y_{2m}) \right)$;
where $W(x,y)$ is the second calculation result, $g(x)$ is a sorting function, $f_{mb}^{a}$ is a random number extraction function used to extract the random number $m$, $b$ is the number of extracted random numbers, $a$ is the number of times random numbers are extracted, $W_1(x_{1m}, y_{1m})$ is each pixel feature coordinate, $W_2(x_{2m}, y_{2m})$ is each reference feature coordinate, and $p_a\left( W_1(x_{1m}, y_{1m}) - W_2(x_{2m}, y_{2m}) \right)$ is the coordinate similarity calculation function;
and repairing the cloud data to be repaired based on the second calculation result.
Optionally, the step of repairing the cloud data to be repaired based on the second calculation result includes:
Searching target second calculation results matched with a preset threshold value in the second calculation results, and determining target text words for generating the target second calculation results;
acquiring text words to be recognized, which are adjacent to the target text words, in the repair reference text, and performing recognition calculation on each text word to be recognized based on the second class deep learning model to obtain a recognition result to repair the cloud data to be repaired, wherein the recognition calculation formula is as follows:
$G = \exp\left(k_t^{T}\right) y_{i2} \,/\, \exp\left(k_0^{t} h_i + b_0^{t} y_{i2}\right)$
where $G$ is the recognition result, $k_t^{T}$ is the feature vector of the t-th text word to be recognized, $y_{i2}$ is the feature label of the second-type deep learning model, $k_0^{t}$ is the feature vector of the second-type deep learning model, $h_i$ is the model loss correction coefficient, and $b_0^{t}$ is the model history recognition correction coefficient.
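The selection logic that precedes the recognition calculation can be sketched as follows. It assumes that "matched with a preset threshold" means a score greater than or equal to the threshold, and that "adjacent" means the immediately preceding and following text words; both readings, the function name and the sample words are assumptions, and the recognition formula itself is not reproduced.

```python
from typing import List


def words_to_recognize(words: List[str], scores: List[float], threshold: float) -> List[str]:
    """Return the text words adjacent to every target word.

    A target word is one whose second calculation result reaches the threshold
    (assumed interpretation). Target words themselves are not returned.
    """
    targets = {i for i, score in enumerate(scores) if score >= threshold}
    neighbours: List[str] = []
    for i in sorted(targets):
        for j in (i - 1, i + 1):
            if 0 <= j < len(words) and j not in targets and words[j] not in neighbours:
                neighbours.append(words[j])
    return neighbours


# Hypothetical segmented reference text and similarity scores.
words = ["sensor", "model", "TX-200", "installed", "2021"]
scores = [0.2, 0.9, 0.1, 0.3, 0.1]
candidates = words_to_recognize(words, scores, threshold=0.8)
```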
Optionally, the step of repairing the cloud data to be repaired based on the repair reference text and the second class deep learning model corresponding to the non-time sequence type includes:
receiving sample data and dividing the sample data into training sample data and test sample data;
Transmitting the training sample data to a preset deep learning model for training, calculating a loss function value of the preset deep learning model, and judging whether the loss function value is smaller than a preset function threshold, wherein the loss function value comprises a word segmentation loss function value, a pixel characteristic loss function value and a text recognition loss function value, and the calculation formula is as follows:
wherein L is the loss function value; in the term used to calculate the word segmentation loss function value, k1 is the number of layers of the preset deep learning model, k2 is the number of training sample data, u ti is the word segmentation feature obtained through training, v i is the word segmentation label feature obtained through training, n i is a parameter of the preset deep learning model, u t0 is the word segmentation feature of the training sample data, and v 0 is the word segmentation label feature of the training sample data;
[ ≡ (e (xi-xj,(yi-yj))/(k2 |) ] is used to calculate the pixel feature loss function value, where (xi, yi) is the pixel coordinates obtained by training, and (xj, yj) is the pixel coordinates of the training sample data;
a further term is used to calculate the text recognition loss function value and includes the dimension reduction parameter of the preset deep learning model, where W T is the feature matrix of the training sample data, X k is the feature value of the k-th training sample data, X T is the central mean value of the training sample data, wi is the text feature obtained through training, hi is the text label feature obtained through training, wj is the text feature of the training sample data, and hj is the text label feature of the training sample data;
If the loss function value is smaller than a preset function threshold, testing the preset deep learning network model based on the test sample data to obtain a test result, judging whether the test result meets a preset ending condition, and if so, generating the preset deep learning model into a second type deep learning model;
And if the loss function value is greater than or equal to a preset function threshold value or the test result does not meet a preset ending condition, adjusting model parameters of the preset deep learning model, performing iterative training on the preset deep learning model after model parameter adjustment based on the training sample data, and executing the step of calculating the loss function value of the preset deep learning model until the test result meets the preset ending condition.
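The control flow of this training procedure (split, train, compare the loss with a threshold, test, otherwise adjust and repeat) can be sketched with a toy model. Everything below is an assumption made for illustration: a one-parameter linear model and a mean squared error stand in for the preset deep learning model and its composite loss, and the thresholds are arbitrary.

```python
from typing import List, Tuple

Sample = Tuple[float, float]


def split_samples(samples: List[Sample], train_ratio: float = 0.8):
    """Divide sample data into training and test sample data."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


def mse(w: float, data: List[Sample]) -> float:
    """Stand-in loss: mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fit(samples: List[Sample], loss_threshold: float = 1e-4,
        test_tolerance: float = 1e-2, lr: float = 0.01,
        max_rounds: int = 10_000) -> float:
    train, test = split_samples(samples)
    w = 0.0
    for _ in range(max_rounds):
        # Loss below the threshold: check the end condition on the test data.
        if mse(w, train) < loss_threshold and mse(w, test) < test_tolerance:
            return w
        # Otherwise adjust the model parameter and iterate.
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad
    raise RuntimeError("end condition not met within max_rounds")


samples = [(float(x), 2.0 * x) for x in range(1, 11)]
w = fit(samples)
```

The `max_rounds` guard is an addition for safety in the sketch; the source describes iteration until the end condition is met without an explicit bound.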
Optionally, before the step of acquiring, when the repair trigger instruction is monitored, the cloud data to be repaired corresponding to the repair trigger instruction, the method includes:
when a preset repair period is reached, detecting whether abnormal data exists in the cloud data, and if so, generating the repair trigger instruction;
or, when abnormal data exists in newly generated cloud data, generating the repair trigger instruction.
Further, in order to achieve the above object, the present invention also provides a cloud data repair system based on deep learning, the cloud data repair system based on deep learning comprising: memory, processor, communication bus, and control program stored on the memory:
The communication bus is used for realizing connection communication between the processor and the memory;
the processor is configured to execute the control program to implement the steps of the cloud data repair method based on deep learning as described above.
Further, in order to achieve the above object, the present invention also provides a readable storage medium having stored thereon a control program which, when executed by a processor, implements the steps of the deep learning-based cloud data restoration method as described above.
According to the cloud data restoration method, system and readable storage medium based on deep learning, once a repair trigger instruction is monitored, the cloud data to be repaired corresponding to the repair trigger instruction is acquired, and the corresponding repair type is determined according to the type of the cloud data to be repaired. If the repair type is the time sequence type, that is, the data to be repaired is time sequence data, a repair factor matrix corresponding to the cloud data to be repaired is generated, and the cloud data to be repaired is repaired based on the repair factor matrix and a first-type deep learning model corresponding to the time sequence type. If the repair type is the non-time sequence type, that is, the data to be repaired is non-time sequence data, a repair reference text corresponding to the cloud data to be repaired is searched for, and the cloud data to be repaired is repaired based on the repair reference text and a second-type deep learning model corresponding to the non-time sequence type. The repair factor matrix is formed from the characteristics of other detection devices that are highly similar to the detection device whose time sequence data needs repair, so it comprehensively reflects the characteristics of the cloud data to be repaired; the repair reference text is the text from which the non-time sequence data to be repaired was originally formed, and reflects the provenance of the cloud data to be repaired.
Therefore, according to the method and the device for repairing the cloud data, different repairing mechanisms are respectively set for the cloud data to be repaired of the time sequence type and the cloud data to be repaired of the non-time sequence type, different deep learning models are trained in advance for the characteristics of the cloud data to be repaired, the information showing the characteristics of the cloud data is combined through the deep learning models to repair, accurate repairing of the time sequence data and the non-time sequence data in the cloud data is achieved, and further accurate storage of the time sequence data and accurate prediction of the time sequence data are achieved.
Drawings
FIG. 1 is a schematic flow chart of a first embodiment of a cloud data restoration method based on deep learning;
FIG. 2 is a schematic flow chart of a second embodiment of a cloud data restoration method based on deep learning according to the present invention;
FIG. 3 is a schematic flow chart of a third embodiment of a cloud data restoration method based on deep learning according to the present invention;
FIG. 4 is a schematic flow chart of a fourth embodiment of a cloud data restoration method based on deep learning according to the present invention;
Fig. 5 is a schematic structural diagram of a hardware operating environment related to an embodiment scheme of a cloud data repair system based on deep learning.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a scheme of a first embodiment of a cloud data restoration method based on deep learning.
This embodiment of the present invention provides a cloud data restoration method based on deep learning. It should be noted that although a logical sequence is shown in the flow chart, in some cases the steps shown or described may be performed in a different order. Specifically, the cloud data restoration method based on deep learning in the present embodiment includes:
Step S10, when a repair trigger instruction is monitored, acquiring cloud data to be repaired corresponding to the repair trigger instruction, and determining a repair type corresponding to the cloud data to be repaired;
The cloud data restoration method based on deep learning is applied to a cloud server that supports communication connections with a plurality of terminals. A terminal may be a sensor on a network such as the Internet of Things, the Internet of Vehicles or a smart home, in which case the sensor transmits the collected data directly to the cloud server. A terminal may also be a terminal device on such a network, in which case the sensor transmits the collected data to the terminal device, and the terminal device then forwards it to the cloud server. Data collected by a sensor is usually time-sequenced: data collected at different times differ and reflect the running state of the monitored performance at those times. The data transmitted by the sensors forms cloud data in the cloud server, which is correspondingly time-sequenced, and the past time sequence data as a whole is used to predict the running state of the performance monitored by the sensor.
It will be appreciated that in some cases, the performance monitored by the sensor itself is not abnormal, but the data collected by the sensor may be abnormal. For example, in the case of a sensor failure, the value of the acquired data is much larger or smaller than the value monitored at ordinary times, or the data is missing, no data is acquired, or the like. These anomalies can easily lead to errors in the prediction of subsequent operating conditions of the monitored performance. In addition, a large amount of cloud data stored in the cloud server includes, in addition to time-series data collected by each sensor, data of each sensor itself, such as non-time-series data having no time characteristics, such as a model number, a service life, and maintenance information of the sensor. Such non-time-sequential data may also be lost for some reasons, such as misdeletion, network attacks, etc. Such anomalies in non-time series data may cause anomalies in time series data storage, such as sensor model anomalies, and related detected data may not be associated with correct sensor storage, thereby resulting in errors in predicting subsequent operational states of the monitored performance of the sensor, or anomalies in monitoring the performance of the sensor itself due to anomalies in sensor age and overhaul information. For this reason, the present embodiment is provided with a repair mechanism for the above-described time-series cloud data and non-time-series cloud data. Specifically, when a repair trigger instruction is monitored, cloud data, which is characterized by the repair trigger instruction and needs to be repaired, namely cloud data to be repaired is acquired, and the data type of the cloud data to be repaired is determined according to the data identification of the cloud data to be repaired. 
If the data identification is a time sequence identification, indicating that the cloud data to be repaired is time sequence cloud data, wherein the data type is time sequence type; and if the data identification is the non-time sequence identification, indicating that the cloud data to be repaired is the non-time sequence cloud data, wherein the data type is the non-time sequence type. And determining a repair mechanism for repairing the cloud data to be repaired according to the data types so as to accurately and quickly repair the cloud data to be repaired of different types.
The repair trigger instruction may be triggered by a preset repair period, or may be triggered when abnormal data is detected, and specifically, before the step of acquiring cloud data to be repaired corresponding to the repair trigger instruction when the repair trigger instruction is detected, the method includes:
step a1, when a preset repair period is reached, detecting whether abnormal data exists in the cloud data, and generating the repair trigger instruction if abnormal data exists;
or a step a2 of generating the repair trigger instruction when abnormal data exists in the newly generated cloud data.
Further, a preset repair period is set in advance according to when the sensors collect and transmit data. Each time the preset repair period is reached, the cloud data stored in the cloud server is checked for abnormal data, and if abnormal data exists, a repair trigger instruction is generated. Because the cloud server stores a very large amount of data, some of which was collected long ago and is rarely used, repairing such rarely used data is of little value; the abnormal-data detection mechanism can therefore be limited to data collected within a recent period, for example the last one, two or three years. In addition, a real-time detection mechanism can be set: data collected by a sensor and transmitted to the cloud server is checked for abnormal data as it arrives, and if abnormal data exists in the newly generated cloud data, a repair trigger instruction is generated. In this way, the repair trigger instruction starts the repair mechanism, which identifies and repairs the abnormal data and thereby ensures the accuracy of the cloud data.
It should be noted that, for a repair trigger instruction generated by the detection mechanism, it must first be confirmed that the performance monitored by the sensor is itself normal. That is, after abnormal data is found in the data collected by the sensor and transmitted to the cloud server, the performance parameters of the device from which the abnormal data originates are obtained, and whether that device is abnormal is determined from them. If the device is not abnormal, the data collected by the sensor is in error. For example, the engine temperature data of a vehicle in the Internet of Vehicles, as detected by a sensor, appears abnormal, but the performance parameters of the engine itself show that the engine is not abnormal. In this case a repair trigger instruction is generated to repair the abnormal cloud data. Conversely, if the performance parameters show that the device from which the abnormal data originates is abnormal, the data collected by the sensor is correct and no repair trigger instruction needs to be generated, which avoids repairing cloud data that does not need repair and wasting cloud service resources.
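The trigger decision described above can be sketched as two checks: whether a reading is abnormal, and whether the monitored device itself is abnormal. The anomaly rule used here (a missing value, or a deviation of more than k standard deviations from recent history) is an assumption chosen for illustration; the source only says that abnormal values are much larger or smaller than usual, or missing.

```python
import statistics
from typing import Optional, Sequence


def is_abnormal(value: Optional[float], history: Sequence[float], k: float = 3.0) -> bool:
    """Flag a missing reading or one far from the recent mean (assumed rule)."""
    if value is None:
        return True
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    return abs(value - mean) > k * max(spread, 1e-9)


def should_trigger_repair(data_abnormal: bool, device_abnormal: bool) -> bool:
    """Repair only when the data is abnormal but the monitored device is not."""
    return data_abnormal and not device_abnormal


# Hypothetical engine temperature readings.
history = [90.0, 91.0, 89.0, 90.0, 92.0]
trigger = should_trigger_repair(is_abnormal(140.0, history), device_abnormal=False)
```

When the device itself is abnormal, the reading is treated as genuine and `should_trigger_repair` returns `False`, matching the resource-saving rationale in the text.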
And step S20, if the repair type is a time sequence type, generating a repair factor matrix corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair factor matrix and a first-type deep learning model corresponding to the time sequence type.
Further, after the repair type of the cloud data to be repaired is determined to be the time sequence type, the cloud data can be repaired with the repair mechanism for time sequence cloud data. Specifically, a deep learning model for repairing time sequence cloud data is trained in advance and serves as the first-type deep learning model corresponding to the time sequence type. Data related to the cloud data to be repaired is then searched for; this related data is data detected by other sensors that are similar, in model, service life, maintenance record, environment and the like, to the sensor that collected the cloud data to be repaired. The related data is formed into the repair factor matrix corresponding to the cloud data to be repaired, the repair factor matrix is transmitted to the first-type deep learning model, and the model processes it to repair the time sequence cloud data to be repaired.
Step S30, if the repair type is a non-time sequence type, searching a repair reference text corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair reference text and a second-type deep learning model corresponding to the non-time sequence type.
Further, if the repair type of the cloud data to be repaired is determined to be a non-time sequence type, searching a repair reference text related to the cloud data to be repaired, for example, if the cloud data to be repaired of the non-time sequence type is a sensor model, searching an article related to the sensor model in the cloud server as the repair reference text. In addition, a deep learning model for repairing non-time series cloud data is trained in advance in the cloud server, and the deep learning model is called as a second type deep learning model corresponding to the non-time series type. And transmitting the repair reference text to a second class of deep learning model, and processing the repair reference text according to the second class of deep learning model to realize the repair of the non-time sequence cloud data to be repaired.
According to the cloud data restoration method based on deep learning, once a repair trigger instruction is monitored, the cloud data to be repaired corresponding to the repair trigger instruction is acquired, and the corresponding repair type is determined according to the type of the cloud data to be repaired. If the repair type is the time sequence type, that is, the data to be repaired is time sequence data, a repair factor matrix corresponding to the cloud data to be repaired is generated, and the cloud data to be repaired is repaired based on the repair factor matrix and a first-type deep learning model corresponding to the time sequence type. If the repair type is the non-time sequence type, that is, the data to be repaired is non-time sequence data, a repair reference text corresponding to the cloud data to be repaired is searched for, and the cloud data to be repaired is repaired based on the repair reference text and a second-type deep learning model corresponding to the non-time sequence type. The repair factor matrix is formed from the characteristics of other detection devices that are highly similar to the detection device whose time sequence data needs repair, so it comprehensively reflects the characteristics of the cloud data to be repaired; the repair reference text is the text from which the non-time sequence data to be repaired was originally formed, and reflects the provenance of the cloud data to be repaired.
Therefore, according to the method and the device for repairing the cloud data, different repairing mechanisms are respectively set for the cloud data to be repaired of the time sequence type and the cloud data to be repaired of the non-time sequence type, different deep learning models are trained in advance for the characteristics of the cloud data to be repaired, the information showing the characteristics of the cloud data is combined through the deep learning models to repair, accurate repairing of the time sequence data and the non-time sequence data in the cloud data is achieved, and further accurate storage of the time sequence data and accurate prediction of the time sequence data are achieved.
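The two repair mechanisms described above can be pictured as a simple dispatch on the repair type. The following is only a minimal sketch: the field names, the type labels and the placeholder handlers are hypothetical and do not appear in the source, which states only that time sequence data and non-time sequence data are routed to different deep learning models.

```python
from typing import Callable, Dict

# Hypothetical field names; the source only distinguishes time sequence
# data (sensor readings) from non-time sequence data (model number, etc.).
NON_TIME_SEQUENCE_FIELDS = {"model_number", "service_life", "maintenance_record"}


def determine_repair_type(record: dict) -> str:
    """Return the repair type from the kind of field that needs repair."""
    if record["field"] in NON_TIME_SEQUENCE_FIELDS:
        return "non_time_sequence"
    return "time_sequence"


def repair_time_sequence(record: dict) -> str:
    # Placeholder: repair factor matrix + first-class deep learning model.
    return "first-class model"


def repair_non_time_sequence(record: dict) -> str:
    # Placeholder: repair reference text + second-class deep learning model.
    return "second-class model"


REPAIR_HANDLERS: Dict[str, Callable[[dict], str]] = {
    "time_sequence": repair_time_sequence,
    "non_time_sequence": repair_non_time_sequence,
}


def on_repair_trigger(record: dict) -> str:
    """Dispatch the cloud data to be repaired to the matching mechanism."""
    return REPAIR_HANDLERS[determine_repair_type(record)](record)
```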
Further, referring to fig. 2, a second embodiment of the cloud data restoration method based on deep learning according to the present invention is provided based on the first embodiment of the cloud data restoration method based on deep learning according to the present invention.
The difference between the second embodiment of the cloud data repair method based on deep learning and the first embodiment of the cloud data repair method based on deep learning is that the step of generating the repair factor matrix corresponding to the cloud data to be repaired includes:
step S21, determining influence factors of the cloud data to be repaired, and determining reference equipment information corresponding to the cloud data to be repaired according to the influence factors, wherein the influence factors at least comprise model numbers, service life years, maintenance records and environmental factors of detection equipment corresponding to the cloud data to be repaired;
Step S22, acquiring proximity data corresponding to the cloud data to be repaired and reference proximity data corresponding to each piece of reference equipment information, and forming the proximity data and each piece of reference proximity data into the repair factor matrix.
Understandably, different time sequence cloud data to be repaired are detected by different types of sensors, and the reference factors of the related data also differ. For the cloud data to be repaired corresponding to the repair trigger instruction, the influence factors affecting the cloud data to be repaired are determined according to the equipment identifier carried by the cloud data to be repaired. The equipment identifier identifies the sensor from which the cloud data to be repaired originates, and the model number, service life, maintenance record, environmental factors of the installation area and the like of that sensor can be obtained through the equipment identifier. These factors affect the accuracy, precision and the like of the data acquired by the sensor, and are therefore taken as the influence factors of the cloud data to be repaired. Other sensors with the same model number, the same service life, similar maintenance records and similar installation-area environmental factors are then searched for according to the influence factors, and the sensors found are taken as the reference equipment information corresponding to the cloud data to be repaired.
Further, the data collected by a sensor are generally relatively continuous, with no great difference between individual readings, and the data detected by sensors with the same influence factors are similar, so abnormal data can be repaired on the basis of such data. Specifically, the detection time point of the cloud data to be repaired is taken as a base time point, and the data detected at several time points before and several time points after the base time point are acquired as the proximity data corresponding to the cloud data to be repaired. At the same time, the detection data of each reference device at the base time point and at the time points corresponding to the proximity data are acquired as the reference proximity data corresponding to each piece of reference equipment information. The acquired proximity data and the reference proximity data are then formed together into the repair factor matrix, so that the cloud data to be repaired can be repaired through processing by the first-class deep learning model. Specifically, the step of repairing the cloud data to be repaired based on the repair factor matrix and the first-class deep learning model corresponding to the time sequence type includes:
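The search for reference equipment in step S21 can be sketched as a filter over known sensors. This is an illustrative sketch only: the `Sensor` fields, the use of a maintenance count as a stand-in for the maintenance record, the use of a label for the environmental factors, and the `max_maintenance_gap` tolerance are all assumptions, since the source does not specify how "similar" is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sensor:
    device_id: str
    model_number: str
    service_years: int
    maintenance_count: int  # assumed stand-in for the maintenance record
    environment: str        # assumed label for installation-area factors


def find_reference_devices(target: Sensor, candidates: List[Sensor],
                           max_maintenance_gap: int = 1) -> List[Sensor]:
    """Keep sensors sharing the target's influence factors (step S21)."""
    return [
        c for c in candidates
        if c.device_id != target.device_id
        and c.model_number == target.model_number
        and c.service_years == target.service_years
        and abs(c.maintenance_count - target.maintenance_count) <= max_maintenance_gap
        and c.environment == target.environment
    ]
```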
step S23, determining a plurality of similarity matrixes corresponding to the repair factor matrixes from training sample matrixes of the first-type deep learning model;
Step S24, transmitting the repair factor matrix and each similarity matrix to the first type of deep learning model, and calculating based on the weight matrix and the activation function of the first type of deep learning model to obtain a first calculation result;
And step S25, repairing the cloud data to be repaired based on the first calculation result.
It is understood that the first-class deep learning model is trained with a plurality of training sample matrices formed from sensor detection data, and the training sample matrices include matrices similar to the repair factor matrix, so that the similar matrices can be found among the training sample matrices and used as the similarity matrices corresponding to the repair factor matrix. The similarity matrices found and the repair factor matrix are then transmitted to the first-class deep learning model, and are processed and calculated through the weight matrix and the activation function of the first-class deep learning model to obtain a first calculation result, wherein the activation function is shown in the following formula (1).
$$F(x)=\max\left(W \cdot T_n / W_i + C_n^{0}\left((n-1)^2/n\right)\cdot T_n\right) \tag{1}$$
Here $F(x)$ is the first calculation result, $\max$ denotes taking the maximum value, $W$ is the repair factor matrix, $T_n$ is the weight matrix, $W_i$ is the i-th similarity matrix, $C_j$ is the layer parameter of the j-th layer in the first-class deep learning model, and $n$ is the number of layers of the first-class deep learning model. The first calculation result obtained from the activation function represents the repair result for the cloud data to be repaired, and the cloud data to be repaired is repaired by replacing it with the first calculation result.
In this embodiment, the repair factor matrix includes both the proximity data detected by the sensor from which the cloud data to be repaired is derived and the reference proximity data detected by the sensor with strong correlation, so that the repair factor matrix can more accurately reflect the characteristics of the cloud data to be repaired, and the similarity matrix found according to the repair factor matrix can also better reflect the characteristics of the cloud data to be repaired.
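As an illustration of steps S22 and S23, the sketch below assembles a repair factor matrix from proximity windows and then picks the closest training sample matrices. Several choices here are assumptions rather than details from the source: the target row's slot at the base time point is filled with the mean of its neighbours so that all rows have equal length, and "similar" is measured with the Frobenius distance.

```python
from typing import Dict, List

Matrix = List[List[float]]


def build_repair_factor_matrix(target: Dict[int, float],
                               references: List[Dict[int, float]],
                               base_t: int, k: int) -> Matrix:
    """Row 0: target proximity data; other rows: reference proximity data."""
    times = range(base_t - k, base_t + k + 1)
    neighbours = [target[t] for t in times if t != base_t]
    # Assumption: the abnormal reading at base_t is replaced by a placeholder.
    placeholder = sum(neighbours) / len(neighbours)
    target_row = [placeholder if t == base_t else target[t] for t in times]
    reference_rows = [[ref[t] for t in times] for ref in references]
    return [target_row] + reference_rows


def frobenius_distance(a: Matrix, b: Matrix) -> float:
    """Element-wise Euclidean distance between two same-shaped matrices."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) ** 0.5


def find_similarity_matrices(repair_matrix: Matrix,
                             training_matrices: List[Matrix],
                             top_k: int) -> List[Matrix]:
    """Return the top_k training sample matrices closest to repair_matrix."""
    ranked = sorted(training_matrices,
                    key=lambda m: frobenius_distance(repair_matrix, m))
    return ranked[:top_k]
```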
Further, referring to fig. 3, a third embodiment of the cloud data restoration method based on deep learning according to the present invention is provided based on the first and second embodiments of the cloud data restoration method based on deep learning according to the present invention.
The difference between the third embodiment of the cloud data repair method based on deep learning and the first and second embodiments of the cloud data repair method based on deep learning is that the step of searching the repair reference text corresponding to the cloud data to be repaired includes:
step S31, determining a time coefficient corresponding to the cloud data to be repaired, and searching a reference file corresponding to the time coefficient;
and step S32, screening the repair reference text from the reference files based on the data type of the cloud data to be repaired.
It can be appreciated that cloud data to be repaired with non-time sequence characteristics mostly relates to the model number, service life, maintenance record and the like of the sensor. For this reason, the amount of data to be processed by the cloud server can be reduced by using the earliest time record, in the cloud server, of the sensor from which the cloud data to be repaired originates. Specifically, the earliest time record differs according to the specific type of the cloud data to be repaired. If the specific type is model number or service life, the earliest time record can be determined from the earliest detection record corresponding to the cloud data to be repaired. If the specific type is maintenance record, the earliest time record can be determined from the time of the maintenance record preceding the cloud data to be repaired, and a time period is determined together with the time of the next maintenance record.
Further, the determined earliest time record is used as the time coefficient corresponding to the cloud data to be repaired, and the reference files in the cloud server are searched based on it. The reference files found are the files in the cloud server that fall within the time coefficient range, including but not limited to sensor use specifications, sensor introduction files, overhaul description files, error description files and the like. In addition, to further reduce the amount of data to be processed, the repair reference text can be screened from the reference files according to the data type of the cloud data to be repaired. The repair reference text may be a document among the reference files, or a part of such a document. For example, if the data type of the cloud data to be repaired is model number or service life, the sensor use specifications, sensor introduction files and the like are selected from the reference files as the repair reference text, so that the second-class deep learning model determines the model number or service life from the recorded contents, thereby repairing the cloud data to be repaired. If the data type of the cloud data to be repaired is maintenance record, content that uniquely identifies the sensor from which the cloud data to be repaired originates, such as its model number and serial number, is first looked up; the overhaul description files and error description files are then screened out from the reference files; and the content containing the model number, serial number and similar information of the sensor is further screened out from those files as the repair reference text, so that the second-class deep learning model can quickly determine the maintenance record from the recorded contents, thereby repairing the cloud data to be repaired.
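Steps S31 and S32 amount to a two-stage filter: first by time, then by data type. The sketch below shows one possible shape for this; the `ReferenceFile` record, the file-kind labels and the mapping from data type to file kinds are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ReferenceFile:
    name: str
    kind: str      # assumed labels: "usage_spec", "introduction", "overhaul", "error"
    created: date


# Assumed mapping, following the examples given in the description.
KINDS_BY_DATA_TYPE = {
    "model_number": {"usage_spec", "introduction"},
    "service_life": {"usage_spec", "introduction"},
    "maintenance_record": {"overhaul", "error"},
}


def find_reference_files(files: List[ReferenceFile], earliest: date,
                         latest: Optional[date] = None) -> List[ReferenceFile]:
    """Step S31: keep files inside the range given by the time coefficient."""
    return [f for f in files
            if f.created >= earliest and (latest is None or f.created <= latest)]


def screen_repair_reference(files: List[ReferenceFile],
                            data_type: str) -> List[ReferenceFile]:
    """Step S32: keep only the file kinds relevant to the data type."""
    kinds = KINDS_BY_DATA_TYPE[data_type]
    return [f for f in files if f.kind in kinds]
```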
Specifically, the step of repairing the cloud data to be repaired based on the repair reference text and the second class deep learning model corresponding to the non-time sequence type includes:
Step S33, word segmentation processing is carried out on the repair reference text based on the second class deep learning model, a plurality of text words are obtained, pixel feature recognition is carried out on each text word, and pixel feature coordinates of each text word are determined;
Step S34, obtaining reference feature coordinates of reference information corresponding to the cloud data to be repaired, and carrying out similarity calculation on each pixel feature coordinate and the reference feature coordinates based on the second-class deep learning model to obtain a second calculation result;
And step S35, repairing the cloud data to be repaired based on the second calculation result.
Further, the second-class deep learning model has been trained on a large number of samples and can recognize text. The repair reference text found is transmitted to the second-class deep learning model, which performs word segmentation on it to obtain a plurality of text words. The word segmentation is performed according to semantic logic relations; meaningless auxiliary words, connecting words and the like obtained by the division, such as 'ground', 'top', 'bottom' and 'top down', are removed, and the remaining words form the text words. The second-class deep learning model then establishes a separate two-dimensional coordinate system for each text word and identifies the pixel features of each text word to obtain its pixel feature coordinates in that coordinate system. The pixel feature recognition may be performed by distinguishing white pixels from non-white pixels: the coordinates of white pixels are not recorded, and the coordinates of non-white pixels are recorded. After all pixels of a text word have been identified, the pixel feature coordinates of that text word are obtained, reflecting the coordinate values of the pixels that make up the text word.
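The pixel feature recognition described above can be illustrated on a small grayscale bitmap. This is only a sketch under stated assumptions: the source does not say how a text word is rendered, so the bitmap is taken as a given input, white is assumed to be the value 255, and `remove_function_words` with its stop-word set is a hypothetical helper standing in for the removal of auxiliary and connecting words.

```python
from typing import Iterable, List, Set, Tuple

Bitmap = List[List[int]]  # rows of grayscale values, 255 assumed to be white


def remove_function_words(words: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Drop auxiliary and connecting words; the rest become text words."""
    return [w for w in words if w not in stop_words]


def pixel_feature_coordinates(bitmap: Bitmap, white: int = 255) -> List[Tuple[int, int]]:
    """Record (x, y) for every non-white pixel; white pixels are skipped."""
    return [(x, y)
            for y, row in enumerate(bitmap)
            for x, value in enumerate(row)
            if value != white]
```

For a 2 by 2 bitmap `[[255, 0], [0, 255]]`, only the two dark pixels are recorded, giving the coordinates `(1, 0)` and `(0, 1)`.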
Further, information related to the repair of the cloud data to be repaired is used as reference information corresponding to the cloud data to be repaired, for example, the name, serial number and the like of the cloud data to be repaired. And establishing respective two-dimensional coordinate systems for the reference information, and carrying out pixel feature recognition on the two-dimensional coordinate systems to obtain each pixel coordinate as a reference feature coordinate so as to embody coordinate values of each pixel in the reference information.
Further, the pixel feature coordinates and the reference feature coordinates are transmitted to the second-class deep learning model, which performs similarity calculation to obtain a second calculation result representing the degree of similarity between the pixel feature coordinates and the reference feature coordinates. The similarity calculation formula is specifically the following formula (2):
$$W(x,y)=g(x)\,\|\,f_{mb}^{a}\,\|\,p_a\left(W_1(x_{1m},y_{1m})-W_2(x_{2m},y_{2m})\right) \tag{2}$$
Wherein $W(x,y)$ is the second calculation result; $g(x)$ is a sorting function; $f_{mb}^{a}$ is a random number extraction function for extracting random numbers m, where b is the number of random numbers extracted each time and a is the number of extractions; $W_1(x_{1m},y_{1m})$ denotes the pixel feature coordinates and $W_2(x_{2m},y_{2m})$ denotes the reference feature coordinates; and $p_a(W_1(x_{1m},y_{1m})-W_2(x_{2m},y_{2m}))$ is the coordinate similarity calculation function.
Because the pixel feature coordinates and the reference feature coordinates involve a large amount of data, performing the calculation for every coordinate would be relatively inefficient and could occupy considerable resources. For this reason, sampling calculation is realized through formula (2). That is, through the sorting function $g(x)$ in formula (2), the pixel feature coordinates are sorted in ascending order of x-coordinate value, and the reference feature coordinates are likewise sorted in ascending order of x-coordinate value. Random numbers m are then extracted through the random number extraction function $f_{mb}^{a}$ in formula (2), in which the extraction number parameter b and the extraction frequency parameter a are set: b random numbers m are extracted each time, and extraction is performed a times. Each extracted random number indicates the position, in the sorted pixel feature coordinates and the sorted reference feature coordinates, of a coordinate on which similarity calculation is to be performed. For example, with b=50 and m=5, 2, 7, 8, 4, ..., the 5th, 2nd, 7th, 8th, 4th, ... coordinate values are selected from each of the sorted pixel feature coordinates and the sorted reference feature coordinates and transmitted to the coordinate similarity calculation function $p_a(W_1(x_{1m},y_{1m})-W_2(x_{2m},y_{2m}))$ for calculation. The second extraction and calculation is then carried out, and so on until the number of extractions and calculations reaches a, whereupon the second calculation result used for repairing the cloud data to be repaired is obtained. The step of repairing the cloud data to be repaired based on the second calculation result comprises the following steps:
Step S351, searching a target second calculation result matched with a preset threshold value in each second calculation result, and determining a target text word for generating the target second calculation result;
Step S352, obtaining text words to be recognized adjacent to the target text word in the repair reference text, and performing recognition calculation on each text word to be recognized based on the second class deep learning model to obtain a recognition result and repair the cloud data to be repaired.
Further, the second calculation result is the similarity between the pixel feature coordinates and the reference feature coordinates, and reflects the similarity between the segmented words obtained from the reference information and the text words obtained from the repair reference text, so that text words in the repair reference text can be found through this similarity and used to repair the cloud data to be repaired. Specifically, to determine how similar the segmented words in the reference information are to the text words in the repair reference text, a threshold is preset, and each calculated second calculation result is compared with the preset threshold to judge whether it is larger than the preset threshold. If so, the segmented word and the text word that produced the second calculation result are highly similar; otherwise, their similarity is low. The second calculation results larger than the preset threshold are found by this comparison and taken as the target second calculation results matched with the preset threshold. Each second calculation result is calculated from one segmented word and one text word, and the text word that generated a target second calculation result is taken as a target text word.
Furthermore, the target text word is highly similar to a segmented word in the reference information, and the word needed to repair the cloud data to be repaired may be among the several words adjacent to the target text word in the repair reference text. For example, if the repair reference text reads "WPU1029, which is the serial number of the sensor with the model number XXXX", the cloud data to be repaired is the serial number, and the target second calculation result is generated by the target text word "XXXX" (the model number), then the cloud data to be repaired can be repaired with the adjacent word "WPU1029". Specifically, after a target text word is determined, the words adjacent to the target text word in the repair reference text are acquired as text words to be recognized and transmitted to the second-class deep learning model, which performs recognition calculation to obtain a recognition result. The formula of the recognition calculation is specifically shown in the following formula (3):
$$G=\frac{\exp\left(k_t^{T}\right)y_{i2}}{\exp\left(k_t^{0}h_i+b_t^{0}y_{i2}\right)} \tag{3}$$
Here $G$ is the recognition result, $k_t^{T}$ is the feature vector of the t-th text word to be recognized, $y_{i2}$ is the feature tag of the second-class deep learning model, $k_t^{0}$ is the feature vector of the second-class deep learning model, $h_i$ is the model loss correction coefficient, and $b_t^{0}$ is the model history recognition correction coefficient.
The second-class deep learning model has been trained on a large number of samples and can accurately recognize text words. During recognition calculation, each text word to be recognized is evaluated by combining the feature tag, the feature vector and the model loss correction coefficient formed during training with the model history recognition correction coefficient formed from the errors of the model's historical recognition calculations. The word corresponding to the recognition result is the word for repairing the cloud data to be repaired, and replacing the cloud data to be repaired with that word completes the repair.
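The sampling idea behind formula (2) and the threshold comparison of step S351 can be sketched as follows. The source does not define the coordinate similarity function, how the a rounds are combined, or how ties in the sort are broken, so the choices below (similarity as 1 / (1 + mean Euclidean distance), averaging over rounds, ties broken by the y value, and the helper names) are assumptions made purely for illustration.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


def sampled_similarity(pixel_coords: List[Coord], reference_coords: List[Coord],
                       b: int, a: int, seed: Optional[int] = None) -> float:
    """Compare b randomly chosen positions of the sorted lists, a times."""
    rng = random.Random(seed)
    p = sorted(pixel_coords)       # sort by x ascending (ties broken by y)
    r = sorted(reference_coords)
    n = min(len(p), len(r))
    if n == 0 or a <= 0 or b <= 0:
        return 0.0
    scores = []
    for _ in range(a):
        positions = [rng.randrange(n) for _ in range(b)]
        mean_dist = sum(math.dist(p[i], r[i]) for i in positions) / b
        scores.append(1.0 / (1.0 + mean_dist))  # assumed similarity measure
    return sum(scores) / a


def match_target_words(score_by_word: Dict[str, float], threshold: float) -> List[str]:
    """Step S351: text words whose second calculation result exceeds the threshold."""
    return [word for word, score in score_by_word.items() if score > threshold]
```

With identical coordinate sets every sampled distance is zero and the similarity is 1.0; the further apart the sampled coordinates are, the closer the score moves to 0.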
The repair reference text is the source text, the initial description text or the historical text of the cloud data to be repaired, and can reflect various information related to the cloud data to be repaired, including the cloud data to be repaired itself. Accurate target text words can be obtained through coordinate similarity calculation between the repair reference text and the reference information of the cloud data to be repaired, the text words to be recognized can in turn be obtained accurately from the target text words, and accurate repair of the cloud data to be repaired is then achieved through the recognition results of the text words to be recognized. In addition, because the coordinate similarity calculation is performed as a sampling calculation in which the number of sampled random numbers and the number of sampling rounds can be set, calculation efficiency is improved while calculation accuracy is maintained, which further improves the efficiency of repairing the cloud data to be repaired.
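The retrieval of text words adjacent to a target text word (step S352) can be sketched as a window lookup over the segmented repair reference text. The `window` parameter and the example word list are hypothetical; the source says only that "several" adjacent words are taken, and the recognition calculation itself is left to the second-class deep learning model.

```python
from typing import List


def adjacent_text_words(text_words: List[str], target: str, window: int = 1) -> List[str]:
    """Collect up to `window` words on each side of every occurrence of target."""
    found: List[str] = []
    for i, word in enumerate(text_words):
        if word == target:
            start = max(0, i - window)
            found.extend(text_words[start:i])
            found.extend(text_words[i + 1:i + 1 + window])
    return found
```

For a segmented text such as `["model number", "XXXX", "sensor", "serial number", "WPU1029"]` with target `"XXXX"` and `window=3`, the candidates include `"WPU1029"`, which the model would then be expected to recognize as the serial number.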
Further, referring to fig. 4, a fourth embodiment of the cloud data restoration method based on deep learning according to the present invention is provided based on the first, second and third embodiments of the cloud data restoration method based on deep learning according to the present invention.
The difference between the fourth embodiment of the cloud data repair method based on deep learning and the first, second and third embodiments of the cloud data repair method based on deep learning is that, before the step of repairing the cloud data to be repaired based on the repair reference text and the second-class deep learning model corresponding to the non-time sequence type, the method includes:
Step S40, receiving sample data and dividing the sample data into training sample data and test sample data;
Step S50, transmitting the training sample data to a preset deep learning model for training, calculating a loss function value of the preset deep learning model, and judging whether the loss function value is smaller than a preset function threshold, wherein the loss function value comprises a word segmentation loss function value, a pixel characteristic loss function value and a text recognition loss function value;
Step S60, if the loss function value is smaller than a preset function threshold, testing the preset deep learning network model based on the test sample data to obtain a test result, judging whether the test result meets a preset ending condition, and if so, generating the preset deep learning model into a second type deep learning model;
And step S70, if the loss function value is greater than or equal to a preset function threshold value or the test result does not meet a preset end condition, adjusting model parameters of the preset deep learning model, performing iterative training on the preset deep learning model after model parameters are adjusted based on the training sample data, and executing the step of calculating the loss function value of the preset deep learning model until the test result meets the preset end condition.
Understandably, the second type of deep learning model needs to be trained in advance before the cloud data to be repaired is repaired by the second type of deep learning model. Specifically, the cloud server receives a large amount of sample data for training, and then divides the large amount of sample data into training sample data and test sample data according to a certain proportion. The training sample data is sample data for model training, the test sample data is data for testing whether model training is completed, and for accuracy of training, the proportion of the training sample data is generally divided into a proportion larger than that of the test sample data, for example, the two are divided according to a proportion of 7 to 3, or the two are divided according to a proportion of 8 to 2, etc.
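The division of sample data in step S40 can be sketched as a shuffled split by ratio. Shuffling before the split and the fixed seed are assumptions; the source specifies only that the training share is larger than the test share, for example 7 to 3 or 8 to 2.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(samples: Sequence[T], train_ratio: float = 0.7,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Divide sample data into training and test sample data (step S40)."""
    if not 0.5 < train_ratio < 1.0:
        raise ValueError("the training share should exceed the test share")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: randomize before splitting
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```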
Further, a preset deep learning model for training is set in advance, and the divided training sample data is transmitted to the preset deep learning model for training. The preset deep learning model is provided with a loss function and a preset function threshold for representing the quality of the training result. After each round of training, the loss function is calculated, and the loss function value is compared with the preset function threshold to judge whether it is smaller than the preset function threshold. If it is smaller, the training performance of the preset deep learning model has reached the required performance, and the test sample data is then transmitted to the preset deep learning model for testing to obtain a test result. The test result reflects the accuracy with which the trained preset deep learning model processes the test sample data, and a preset ending condition is set in advance to represent the quality of the test result, for example, that the accuracy of the processing result reaches 98%. The obtained test result is compared with the preset ending condition to judge whether the test result meets it; if so, the data processing accuracy of the trained preset deep learning model is high, and the trained preset deep learning model is taken as the second-class deep learning model.
It should be noted that, since the second type of deep learning model at least needs to perform word segmentation, pixel feature coordinate similarity calculation, text recognition calculation, and the like, the model accuracy is at least related to the three parts, and thus the corresponding loss function is also related to the three parts. Specifically, the calculation formula of the loss function value can be seen in the following formula (4):
Wherein L is the loss function value. One part of formula (4) is used to calculate the word segmentation loss function value, where $k_1$ is the number of layers of the preset deep learning model, $k_2$ is the number of training sample data, $u_{ti}$ is the word segmentation feature obtained through training, $v_i$ is the word segmentation label feature obtained through training, $n_i$ is a parameter of the preset deep learning model, $u_{t0}$ is the word segmentation feature of the training sample data, and $v_0$ is the word segmentation label feature of the training sample data;
another part, computed from the coordinate differences $(x_i-x_j,\ y_i-y_j)$ and $k_2$, is used to calculate the pixel feature loss function value, where $(x_i, y_i)$ are the pixel coordinates obtained through training and $(x_j, y_j)$ are the pixel coordinates of the training sample data;
a further part is used to calculate the text recognition loss function value, and a remaining part is used to calculate the dimension reduction parameters of the preset deep learning model, where $W_T$ is the feature matrix of the training sample data, $X_k$ is the feature value of the k-th training sample data, $X_T$ is the central mean of the training sample data, $w_i$ is the text feature obtained through training, $h_i$ is the text label feature obtained through training, $w_j$ is the text feature of the training sample data, and $h_j$ is the text label feature of the training sample data.
After each round of training of the preset deep learning model, the word segmentation loss function value, the pixel feature loss function value and the text recognition loss function value are calculated through the preset loss function, and the three values are added to obtain the overall loss function value. Whether to test the preset deep learning model with the test sample data is then determined according to the magnitude relation between the loss function value and the preset function threshold.
In addition, it should be noted that separate preset function thresholds may be set for the word segmentation loss function value, the pixel feature loss function value and the text recognition loss function value. For example, a first preset function threshold, a second preset function threshold, a third preset function threshold and a total preset function threshold are set in advance. After the word segmentation loss function value, the pixel feature loss function value and the text recognition loss function value are calculated, the word segmentation loss function value is compared with the first preset function threshold, the pixel feature loss function value is compared with the second preset function threshold, and the text recognition loss function value is compared with the third preset function threshold. Once the comparison shows that the word segmentation loss function value is smaller than the first preset function threshold, the pixel feature loss function value is smaller than the second preset function threshold, and the text recognition loss function value is smaller than the third preset function threshold, the three values are added to obtain the loss function value, which is compared with the total preset function threshold; if the loss function value is smaller than the total preset function threshold, the preset deep learning model is tested with the test sample data.
If any one of the word segmentation loss function value, the pixel feature loss function value and the text recognition loss function value is not smaller than its corresponding preset function threshold, the three values are not summed and training continues with the training sample data. Only when all three values are smaller than their corresponding preset function thresholds are they summed and the resulting loss function value compared with the total preset function threshold. In this way, the performance of the second-class deep learning model in word segmentation processing, pixel feature coordinate similarity calculation and text recognition calculation is further improved.
Further, if the calculated loss function value is greater than or equal to the preset function threshold, the training of the preset deep learning model has not reached the required performance and training needs to continue; likewise, if the result of testing with the test sample data fails to meet the preset ending condition, the data processing accuracy of the trained preset deep learning model is low and training needs to continue. In either case, the model parameters of the preset deep learning model are adjusted according to a preset adjustment rule, for example by stepping the parameters along a fixed progression such as an arithmetic sequence at each adjustment. The preset deep learning model with adjusted model parameters is then iteratively trained with the training sample data, the loss function value is calculated again through formula (4) after training, and once the loss function value is smaller than the preset function threshold the model is tested with the test sample data. This repeats until the obtained test result meets the preset ending condition, at which point training of the preset deep learning model is judged to be complete and the second-class deep learning model is generated.
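The two-level threshold check described above can be written as a small predicate. The function name and argument layout are hypothetical; the logic follows the description: each component must be below its own threshold before the sum is compared with the total threshold.

```python
from typing import Tuple


def ready_for_testing(segmentation_loss: float, pixel_loss: float, text_loss: float,
                      thresholds: Tuple[float, float, float],
                      total_threshold: float) -> bool:
    """True when the model may proceed to testing with the test sample data."""
    losses = (segmentation_loss, pixel_loss, text_loss)
    # Any component at or above its own threshold means training continues
    # and the three values are not summed.
    if any(loss >= limit for loss, limit in zip(losses, thresholds)):
        return False
    return sum(losses) < total_threshold
```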
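The control flow of steps S50 to S70 can be summarized as a loop over train, check loss, test, and adjust. In this sketch the three callables and the `max_rounds` safety limit are hypothetical; the source describes the loop as continuing until the test result meets the preset ending condition and does not prescribe an upper bound.

```python
from typing import Callable


def train_until_accepted(train_step: Callable[[], float],
                         evaluate: Callable[[], float],
                         adjust_parameters: Callable[[], None],
                         loss_threshold: float,
                         accuracy_target: float,
                         max_rounds: int = 1000) -> int:
    """Return the round in which the model met the preset ending condition."""
    for round_no in range(1, max_rounds + 1):
        loss = train_step()                      # train and compute the loss value
        if loss < loss_threshold and evaluate() >= accuracy_target:
            return round_no                      # model becomes the second-class model
        adjust_parameters()                      # step S70: adjust and iterate
    raise RuntimeError("preset ending condition not met within max_rounds")
```

Note that the test step runs only after the loss check passes, matching the order given in steps S60 and S70.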
According to the embodiment, the functions of word segmentation processing, pixel feature coordinate similarity calculation, text recognition calculation and the like required by the second-class deep learning model are respectively considered, corresponding word segmentation loss function values, pixel feature loss function values and text recognition loss function values are set, and the loss function values formed by the three functions reflect the performance of the second-class deep learning model, so that the accuracy of the word segmentation processing, pixel feature coordinate similarity calculation and text recognition calculation of the second-class deep learning model is better. In addition, the method is used for iterative training of the second-class deep learning model through massive sample data, and in the training process, besides the training result is reflected by the loss function value, test sample data are set for testing, so that the data processing accuracy of the second-class deep learning model is further improved.
In addition, an embodiment of the invention provides a deep learning-based cloud data repair system. Fig. 5 is a schematic structural diagram of the device hardware operating environment involved in an embodiment of the deep learning-based cloud data repair system according to the present invention.
As shown in fig. 5, the deep learning-based cloud data repair system may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the hardware structure shown in fig. 5 does not limit the deep learning-based cloud data repair system, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in fig. 5, the memory 1005, as one type of readable storage medium, may include an operating system, a network communication module, a user interface module, and a control program. The operating system is a program that manages and controls the hardware and software resources of the deep learning-based cloud data repair system and supports the operation of the network communication module, the user interface module, the control program, and other programs or software; the network communication module is used to manage and control the network interface 1004; the user interface module is used to manage and control the user interface 1003.
In the hardware structure of the cloud data repair system based on deep learning shown in fig. 5, the network interface 1004 is mainly used for connecting a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; the processor 1001 may call a control program stored in the memory 1005 and perform the following operations:
when a repair trigger instruction is monitored, acquiring cloud data to be repaired corresponding to the repair trigger instruction, and determining a repair type corresponding to the cloud data to be repaired;
If the repair type is a time sequence type, generating a repair factor matrix corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair factor matrix and a first type deep learning model corresponding to the time sequence type;
If the repair type is a non-time sequence type, searching a repair reference text corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair reference text and a second-class deep learning model corresponding to the non-time sequence type.
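The three operations above amount to a dispatch on the repair type. The following is a minimal Python sketch of that control flow only. The names `RepairType`, `determine_repair_type`, and the two handler functions are hypothetical, and the rule that a record with a `timestamp` field is time-sequence data is an assumption made for illustration; the text does not state how the repair type is determined.

```python
from enum import Enum
from typing import Callable, Dict, List


class RepairType(Enum):
    TIME_SEQUENCE = "time_sequence"
    NON_TIME_SEQUENCE = "non_time_sequence"


def determine_repair_type(record: dict) -> RepairType:
    # Assumption for illustration: records carrying a timestamp are time-sequence data.
    if "timestamp" in record:
        return RepairType.TIME_SEQUENCE
    return RepairType.NON_TIME_SEQUENCE


def repair_time_sequence(record: dict) -> dict:
    # Stand-in for: build the repair factor matrix, then apply the first-class model.
    return {**record, "repaired_by": "first_class_model"}


def repair_non_time_sequence(record: dict) -> dict:
    # Stand-in for: look up the repair reference text, then apply the second-class model.
    return {**record, "repaired_by": "second_class_model"}


REPAIR_HANDLERS: Dict[RepairType, Callable[[dict], dict]] = {
    RepairType.TIME_SEQUENCE: repair_time_sequence,
    RepairType.NON_TIME_SEQUENCE: repair_non_time_sequence,
}


def on_repair_trigger(records_to_repair: List[dict]) -> List[dict]:
    """Route each record to the handler that matches its repair type."""
    return [REPAIR_HANDLERS[determine_repair_type(r)](r) for r in records_to_repair]
```

Keeping the handlers in a lookup table means a further repair type could be added without touching the routing function.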
Further, the step of generating the repair factor matrix corresponding to the cloud data to be repaired includes:
Determining influence factors of the cloud data to be repaired, and determining reference equipment information corresponding to the cloud data to be repaired according to the influence factors, wherein the influence factors at least comprise the model number, service life, maintenance record and environmental factors of detection equipment corresponding to the cloud data to be repaired;
And acquiring adjacent data corresponding to the cloud data to be repaired and reference adjacent data corresponding to each piece of reference equipment information respectively, and forming the adjacent data and each piece of reference adjacent data into the repair factor matrix.
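One way to picture the repair factor matrix is as a stack of rows: the first row holds the values adjacent to the missing point in the target series, and each further row holds the values at the same positions from one reference device. The sketch below assumes this layout. The `DeviceInfo` fields, the matching rule in `find_reference_devices`, and the `radius` parameter are all hypothetical choices, since the text names the influence factors but not how they are compared.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Series = List[Optional[float]]


@dataclass(frozen=True)
class DeviceInfo:
    model: str
    service_years: int
    maintenance_level: str
    environment: str


def find_reference_devices(target: DeviceInfo, candidates: Dict[str, DeviceInfo]) -> List[str]:
    # Assumed rule: same model, maintenance level and environment,
    # and a service life within one year of the target device.
    return sorted(
        dev_id
        for dev_id, info in candidates.items()
        if info.model == target.model
        and info.maintenance_level == target.maintenance_level
        and info.environment == target.environment
        and abs(info.service_years - target.service_years) <= 1
    )


def adjacent_values(series: Series, index: int, radius: int) -> Series:
    # Values on both sides of `index`, excluding the position itself.
    left = series[max(0, index - radius):index]
    right = series[index + 1:index + 1 + radius]
    return left + right


def build_repair_factor_matrix(
    target_series: Series,
    missing_index: int,
    reference_series: Dict[str, Series],
    radius: int = 2,
) -> List[Series]:
    # Row 0: neighbours of the value to repair; later rows: one per reference device.
    rows = [adjacent_values(target_series, missing_index, radius)]
    for dev_id in sorted(reference_series):
        rows.append(adjacent_values(reference_series[dev_id], missing_index, radius))
    return rows
```

Near the start or end of a series the rows come out shorter than `2 * radius`; a real implementation would need a padding rule, which the text does not specify.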
Further, the step of repairing the cloud data to be repaired based on the repair factor matrix and the first class deep learning model corresponding to the time sequence type includes:
determining, from the training sample matrices of the first-class deep learning model, a plurality of similarity matrices corresponding to the repair factor matrix;
Transmitting the repair factor matrix and each similarity matrix to the first type of deep learning model, and calculating based on the weight matrix and the activation function of the first type of deep learning model to obtain a first calculation result, wherein the activation function is as follows:
F(x)=max(W*Tn/Wi+(Cj((n-1)2/n)*Tn));
where F(x) is the first calculation result, max denotes taking the maximum value, W is the repair factor matrix, Tn is the weight matrix, Wi is the i-th similarity matrix, Cj is the layer parameter of the j-th layer in the first-class deep learning model, and n is the number of layers of the first-class deep learning model;
and repairing the cloud data to be repaired based on the first calculation result.
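The text does not say how the similarity matrices are chosen from the training sample matrices. As a sketch under that caveat, the example below ranks training matrices by Frobenius distance to the repair factor matrix and keeps the `k` closest. Both the distance measure and the parameter `k` are assumptions introduced here, and the activation function F(x) itself is not implemented.

```python
import math
from typing import List

Matrix = List[List[float]]


def frobenius_distance(a: Matrix, b: Matrix) -> float:
    # Square root of the sum of squared element-wise differences.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def select_similarity_matrices(
    repair_matrix: Matrix, training_matrices: List[Matrix], k: int
) -> List[Matrix]:
    # Assumed selection rule: the k training matrices closest to the repair factor matrix.
    ranked = sorted(training_matrices, key=lambda m: frobenius_distance(repair_matrix, m))
    return ranked[:k]
```

Any other matrix similarity measure could be substituted by changing the sort key.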
Further, the step of searching the repair reference text corresponding to the cloud data to be repaired includes:
Determining a time coefficient corresponding to the cloud data to be repaired, and searching a reference file corresponding to the time coefficient;
and screening the repair reference text from each reference file based on the data type of the cloud data to be repaired.
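The two-stage lookup above (first by time coefficient, then by data type) can be sketched as follows. The `ReferenceFile` structure, with one text section per data type, is a hypothetical layout chosen for illustration, and the time coefficient is modelled as a plain string key because the text does not define its form.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReferenceFile:
    time_coefficient: str                                    # assumed form, e.g. a date bucket
    sections: Dict[str, str] = field(default_factory=dict)   # data type -> reference text


def find_repair_reference_texts(
    time_coefficient: str, data_type: str, files: List[ReferenceFile]
) -> List[str]:
    # Stage 1: keep only the reference files that match the time coefficient.
    matching_files = [f for f in files if f.time_coefficient == time_coefficient]
    # Stage 2: from each matching file, keep the text for the requested data type.
    return [f.sections[data_type] for f in matching_files if data_type in f.sections]
```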
Further, the step of repairing the cloud data to be repaired based on the repair reference text and the second class deep learning model corresponding to the non-time sequence type includes:
Word segmentation processing is carried out on the repair reference text based on the second class deep learning model, a plurality of text words are obtained, pixel feature recognition is carried out on each text word, and pixel feature coordinates of each text word are determined;
obtaining reference feature coordinates of reference information corresponding to the cloud data to be repaired, and performing similarity calculation on each pixel feature coordinate and the reference feature coordinates based on the second-class deep learning model to obtain a second calculation result, wherein the similarity calculation formula is as follows:
W(x,y)=g(x)||fmb a||pa(W1(x1m,y1m)-W2(x2m,y2m));
where W(x,y) is the second calculation result, g(x) is a sorting function, fmb a is a random number extraction function used to extract a random number m, b is the number of random numbers extracted, a is the number of times random numbers are extracted, W1(x1m,y1m) is each pixel feature coordinate, W2(x2m,y2m) is each reference feature coordinate, and pa(W1(x1m,y1m)-W2(x2m,y2m)) is the coordinate similarity calculation function;
and repairing the cloud data to be repaired based on the second calculation result.
Further, the step of repairing the cloud data to be repaired based on the second calculation result includes:
searching the second calculation results for target second calculation results that match a preset threshold, and determining the target text words from which the target second calculation results were generated;
acquiring, from the repair reference text, the text words to be recognized that are adjacent to the target text words, and performing recognition calculation on each text word to be recognized based on the second-class deep learning model to obtain a recognition result with which the cloud data to be repaired is repaired, wherein the recognition calculation formula is as follows:
G=exp(kT t)yi2/exp(kt 0hi+bt 0yi2)
where G is the recognition result, kT t is the feature vector of the t-th text word to be recognized, yi2 is the feature label of the second-class deep learning model, kt 0 is the feature vector of the second-class deep learning model, hi is the model loss correction coefficient, and bt 0 is the model historical recognition correction coefficient.
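The selection of target text words and of their neighbours can be illustrated without the recognition formula itself. The sketch below assumes that a second calculation result "matches" the preset threshold when it is greater than or equal to it, and that "adjacent" means the word immediately before or after a target word. Both readings are assumptions, and the function names are hypothetical.

```python
from typing import List, Set


def select_target_indices(second_results: List[float], threshold: float) -> Set[int]:
    # Assumed reading of "matches the preset threshold": result >= threshold.
    return {i for i, score in enumerate(second_results) if score >= threshold}


def adjacent_indices_to_recognize(word_count: int, target_indices: Set[int]) -> List[int]:
    """Positions directly before or after a target word, excluding target words."""
    neighbours = set()
    for i in target_indices:
        for j in (i - 1, i + 1):
            if 0 <= j < word_count and j not in target_indices:
                neighbours.add(j)
    return sorted(neighbours)
```

For a five-word reference text with results `[0.1, 0.9, 0.2, 0.3, 0.95]` and a threshold of 0.9, the target words sit at positions 1 and 4, so positions 0, 2, and 3 would be passed on to the recognition calculation.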
Further, before the step of repairing the cloud data to be repaired based on the repair reference text and the second class deep learning model corresponding to the non-time sequence type, the processor 1001 may call a control program stored in the memory 1005, and perform the following operations:
receiving sample data and dividing the sample data into training sample data and test sample data;
Transmitting the training sample data to a preset deep learning model for training, calculating a loss function value of the preset deep learning model, and judging whether the loss function value is smaller than a preset function threshold, wherein the loss function value comprises a word segmentation loss function value, a pixel characteristic loss function value and a text recognition loss function value, and the calculation formula is as follows:
where L is the loss function value; the word segmentation component of the formula is used to calculate the word segmentation loss function value, in which k1 is the number of layers of the preset deep learning model, k2 is the number of training sample data, u ti is the word segmentation feature obtained through training, v i is the word segmentation label feature obtained through training, n i is a parameter of the preset deep learning model, u t0 is the word segmentation feature of the training sample data, and v 0 is the word segmentation label feature of the training sample data;
[ ≡ (e (xi-xj,(yi-yj))/(k2 |) ] is used to calculate the pixel feature loss function value, (xi, yi) is the pixel coordinates obtained by training, and (xj, yj) is the pixel coordinates of the training sample data;
the text recognition component of the formula is used to calculate the text recognition loss function value, and a further component is used to calculate the dimension-reduction parameter of the preset deep learning model, where W T is the feature matrix of the training sample data, X k is the feature value of the k-th training sample data, X T is the central mean of the training sample data, wi is the text feature obtained through training, hi is the text label feature obtained through training, wj is the text feature of the training sample data, and hj is the text label feature of the training sample data;
if the loss function value is smaller than the preset function threshold, testing the preset deep learning model based on the test sample data to obtain a test result, judging whether the test result meets a preset ending condition, and if so, taking the preset deep learning model as the second-class deep learning model;
And if the loss function value is greater than or equal to a preset function threshold value or the test result does not meet a preset ending condition, adjusting model parameters of the preset deep learning model, performing iterative training on the preset deep learning model after model parameter adjustment based on the training sample data, and executing the step of calculating the loss function value of the preset deep learning model until the test result meets the preset ending condition.
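The training procedure described in these steps is a loop with two gates: the loss must fall below the threshold, and the test on held-out data must then pass. The sketch below shows only that control flow. The model, loss, test, and adjustment rule are passed in as callables because their concrete forms are not recoverable from this text; `max_rounds` is a safety limit added here and is not part of the described method.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")


def split_samples(samples: Sequence, train_ratio: float = 0.8) -> Tuple[List, List]:
    # The 80/20 split is an assumed default; the text only says the data is divided.
    cut = int(len(samples) * train_ratio)
    return list(samples[:cut]), list(samples[cut:])


def train_until_accepted(
    params: P,
    train_step: Callable[[P], P],
    compute_loss: Callable[[P], float],
    run_test: Callable[[P], bool],
    adjust: Callable[[P], P],
    loss_threshold: float,
    max_rounds: int = 100,
) -> P:
    for _ in range(max_rounds):
        params = train_step(params)
        loss = compute_loss(params)
        # Gate 1: loss below threshold. Gate 2: test result meets the ending condition.
        if loss < loss_threshold and run_test(params):
            return params
        # Either gate failed: adjust the parameters and train again.
        params = adjust(params)
    raise RuntimeError("ending condition not met within max_rounds")
```

A toy call with a single numeric parameter, a quadratic loss around 3.0, and an arithmetic adjustment step of 0.5 would look like `train_until_accepted(0.0, lambda w: w, lambda w: (w - 3.0) ** 2, lambda w: abs(w - 3.0) < 0.2, lambda w: w + 0.5, 0.05)`.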
Further, before the step of acquiring the cloud data to be repaired corresponding to the repair trigger instruction when the repair trigger instruction is detected, the processor 1001 may call a control program stored in the memory 1005, and perform the following operations:
when it is detected that a preset repair period has been reached, detecting whether abnormal data exists in each piece of cloud data, and if so, generating the repair trigger instruction;
or, when abnormal data is detected in newly generated cloud data, generating the repair trigger instruction.
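The two trigger paths, a periodic scan and a check on newly generated data, can be sketched as two small predicates sharing one abnormality test. What counts as abnormal is not defined in the text, so the rule below (a missing value or a value outside a `[low, high]` range) is an assumption, as are the function names and the numeric time representation.

```python
from typing import Iterable, Optional


def is_abnormal(value: Optional[float], low: float, high: float) -> bool:
    # Assumed rule: a missing or out-of-range value is abnormal.
    return value is None or not (low <= value <= high)


def periodic_trigger(
    now: float,
    last_check: float,
    period: float,
    values: Iterable[Optional[float]],
    low: float,
    high: float,
) -> bool:
    """True if the repair period has elapsed and any stored value is abnormal."""
    if now - last_check < period:
        return False
    return any(is_abnormal(v, low, high) for v in values)


def new_data_trigger(value: Optional[float], low: float, high: float) -> bool:
    """True if a newly generated value is abnormal, regardless of the repair period."""
    return is_abnormal(value, low, high)
```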
The specific implementation manner of the cloud data restoration system based on the deep learning is basically the same as the above embodiments of the cloud data restoration method based on the deep learning, and is not repeated here.
The embodiment of the invention also provides a readable storage medium. The readable storage medium has stored thereon a control program which, when executed by a processor, implements the steps of the deep learning-based cloud data restoration method described above.
The specific implementation manner of the readable storage medium of the present invention may be substantially the same as the embodiments of the cloud data restoration method based on deep learning, and will not be described herein.
While the embodiments of the present invention have been described above with reference to the drawings, the present invention is not limited to these specific embodiments, which are merely illustrative and not restrictive. Those of ordinary skill in the art may, in light of the present invention, make many modifications without departing from the spirit of the invention and the scope protected by the claims. Any equivalent structure or equivalent process transformation made using the description and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (8)

1. The cloud data restoration method based on deep learning is characterized by comprising the following steps of:
when a repair trigger instruction is monitored, acquiring cloud data to be repaired corresponding to the repair trigger instruction, and determining a repair type corresponding to the cloud data to be repaired;
If the repair type is a time sequence type, generating a repair factor matrix corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair factor matrix and a first type deep learning model corresponding to the time sequence type;
If the repair type is a non-time sequence type, searching a repair reference text corresponding to the cloud data to be repaired, and repairing the cloud data to be repaired based on the repair reference text and a second-class deep learning model corresponding to the non-time sequence type;
the step of generating the repair factor matrix corresponding to the cloud data to be repaired comprises the following steps:
Determining influence factors of the cloud data to be repaired, and determining reference equipment information corresponding to the cloud data to be repaired according to the influence factors, wherein the influence factors at least comprise the model number, service life, maintenance record and environmental factors of detection equipment corresponding to the cloud data to be repaired;
Acquiring adjacent data corresponding to the cloud data to be repaired and reference adjacent data corresponding to each piece of reference equipment information respectively, and forming the adjacent data and each piece of reference adjacent data into the repair factor matrix;
the step of repairing the cloud data to be repaired based on the repair factor matrix and a first type deep learning model corresponding to the time sequence type comprises the following steps:
determining, from the training sample matrices of the first-class deep learning model, a plurality of similarity matrices corresponding to the repair factor matrix;
transmitting the repair factor matrix and each similarity matrix to the first type of deep learning model, and calculating to obtain a first calculation result based on the weight matrix and the activation function of the first type of deep learning model, wherein the activation function is as follows:
F(x)=max(W*Tn/Wi+(Cj((n-1)2/n)*Tn));
where F(x) is the first calculation result, max denotes taking the maximum value, W is the repair factor matrix, Tn is the weight matrix, Wi is the i-th similarity matrix, Cj is the layer parameter of the j-th layer in the first-class deep learning model, and n is the number of layers of the first-class deep learning model;
and repairing the cloud data to be repaired based on the first calculation result.
2. The cloud data repair method of claim 1, wherein the step of searching for repair reference text corresponding to the cloud data to be repaired comprises:
Determining a time coefficient corresponding to the cloud data to be repaired, and searching a reference file corresponding to the time coefficient;
and screening the repair reference text from each reference file based on the data type of the cloud data to be repaired.
3. The cloud data repair method according to claim 2, wherein the step of repairing the cloud data to be repaired based on the repair reference text and a second class deep learning model corresponding to the non-time series type includes:
Word segmentation processing is carried out on the repair reference text based on the second class deep learning model, a plurality of text words are obtained, pixel feature recognition is carried out on each text word, and pixel feature coordinates of each text word are determined;
obtaining reference feature coordinates of reference information corresponding to the cloud data to be repaired, and performing similarity calculation on each pixel feature coordinate and the reference feature coordinates based on the second-class deep learning model to obtain a second calculation result, wherein the similarity calculation formula is as follows:
W(x,y)=g(x)||fmb a||pa(W1(x1m,y1m)-W2(x2m,y2m));
where W(x,y) is the second calculation result, g(x) is a sorting function, fmb a is a random number extraction function used to extract a random number m, b is the number of random numbers extracted, a is the number of times random numbers are extracted, W1(x1m,y1m) is each pixel feature coordinate, W2(x2m,y2m) is each reference feature coordinate, and pa(W1(x1m,y1m)-W2(x2m,y2m)) is the coordinate similarity calculation function;
and repairing the cloud data to be repaired based on the second calculation result.
4. The cloud data repair method of claim 3, wherein the step of repairing the cloud data to be repaired based on the second calculation result includes:
searching the second calculation results for target second calculation results that match a preset threshold, and determining the target text words from which the target second calculation results were generated;
acquiring, from the repair reference text, the text words to be recognized that are adjacent to the target text words, and performing recognition calculation on each text word to be recognized based on the second-class deep learning model to obtain a recognition result with which the cloud data to be repaired is repaired, wherein the recognition calculation formula is as follows:
G=exp(kT t)yi2/exp(kt 0hi+bt 0yi2)
where G is the recognition result, kT t is the feature vector of the t-th text word to be recognized, yi2 is the feature label of the second-class deep learning model, kt 0 is the feature vector of the second-class deep learning model, hi is the model loss correction coefficient, and bt 0 is the model historical recognition correction coefficient.
5. The cloud data repair method according to any one of claims 1 to 4, wherein, before the step of repairing the cloud data to be repaired based on the repair reference text and a second-class deep learning model corresponding to the non-time sequence type, the method further comprises:
receiving sample data and dividing the sample data into training sample data and test sample data;
Transmitting the training sample data to a preset deep learning model for training, calculating a loss function value of the preset deep learning model, and judging whether the loss function value is smaller than a preset function threshold, wherein the loss function value comprises a word segmentation loss function value, a pixel characteristic loss function value and a text recognition loss function value, and the calculation formula is as follows:
where L is the loss function value; the word segmentation component of the formula is used to calculate the word segmentation loss function value, in which k1 is the number of layers of the preset deep learning model, k2 is the number of training sample data, u ti is the word segmentation feature obtained through training, v i is the word segmentation label feature obtained through training, n i is a parameter of the preset deep learning model, u t0 is the word segmentation feature of the training sample data, and v 0 is the word segmentation label feature of the training sample data;
[ ≡ (e (xi-xj,(yi-yj))/(k2 |) ] is used to calculate the pixel feature loss function value, (xi, yi) is the pixel coordinates obtained by training, and (xj, yj) is the pixel coordinates of the training sample data;
the text recognition component of the formula is used to calculate the text recognition loss function value, and a further component is used to calculate the dimension-reduction parameter of the preset deep learning model, where W T is the feature matrix of the training sample data, X k is the feature value of the k-th training sample data, X T is the central mean of the training sample data, wi is the text feature obtained through training, hi is the text label feature obtained through training, wj is the text feature of the training sample data, and hj is the text label feature of the training sample data;
If the loss function value is smaller than a preset function threshold, testing the preset deep learning model based on the test sample data to obtain a test result, judging whether the test result meets a preset ending condition, and if so, generating the preset deep learning model into a second type deep learning model;
And if the loss function value is greater than or equal to a preset function threshold value or the test result does not meet a preset ending condition, adjusting model parameters of the preset deep learning model, performing iterative training on the preset deep learning model after model parameter adjustment based on the training sample data, and executing the step of calculating the loss function value of the preset deep learning model until the test result meets the preset ending condition.
6. The cloud data repair method according to any one of claims 1 to 4, wherein, before the step of acquiring the cloud data to be repaired corresponding to the repair trigger instruction when the repair trigger instruction is monitored, the method further comprises:
when it is detected that a preset repair period has been reached, detecting whether abnormal data exists in each piece of cloud data, and if so, generating the repair trigger instruction;
or, when abnormal data is detected in newly generated cloud data, generating the repair trigger instruction.
7. A deep learning-based cloud data repair system, comprising: a memory, a processor, a communication bus, and a control program stored on the memory, wherein:
The communication bus is used for realizing connection communication between the processor and the memory;
the processor is configured to execute the control program to implement the steps of the deep learning-based cloud data repair method according to any one of claims 1 to 6.
8. A readable storage medium, wherein a control program is stored on the readable storage medium, and when executed by a processor, the control program implements the steps of the deep learning-based cloud data restoration method according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311496926.XA CN117556187B (en) 2023-11-10 2023-11-10 Cloud data restoration method and system based on deep learning and readable storage medium

Publications (2)

Publication Number Publication Date
CN117556187A CN117556187A (en) 2024-02-13
CN117556187B true CN117556187B (en) 2024-05-10

Family

ID=89813952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311496926.XA Active CN117556187B (en) 2023-11-10 2023-11-10 Cloud data restoration method and system based on deep learning and readable storage medium

Country Status (1)

Country Link
CN (1) CN117556187B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210082103A (en) * 2019-12-24 2021-07-02 탱커주식회사 An apparatus and a method for calculating expected real estate transaction price based on real estate transaction price by using a machine learning model
CN112560476A (en) * 2020-12-09 2021-03-26 中科讯飞互联(北京)信息科技有限公司 Text completion method, electronic device and storage device
CN113743297A (en) * 2021-09-03 2021-12-03 重庆大学 Storage tank dome displacement data restoration method and device based on deep learning
CN113988210A (en) * 2021-11-10 2022-01-28 长沙理工大学 Method and device for restoring distorted data of structure monitoring sensor network and storage medium
CN116991802A (en) * 2023-07-03 2023-11-03 深圳软牛科技有限公司 File repair method, device, terminal equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant