CN112506423B - Method and device for dynamically accessing storage equipment in cloud storage system - Google Patents

Info

Publication number
CN112506423B
Authority
CN
China
Prior art keywords
data
storage
storage device
stored
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011204997.4A
Other languages
Chinese (zh)
Other versions
CN112506423A (en)
Inventor
李雨来
张天石
陈震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Speedycloud Technology Co ltd
Original Assignee
Beijing Speedycloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Speedycloud Technology Co ltd
Priority to CN202011204997.4A
Publication of CN112506423A
Application granted
Publication of CN112506423B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0635 Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for dynamically accessing a storage device in a cloud storage system, belonging to the field of computer communication, wherein the method comprises the following steps: acquiring a characteristic data set corresponding to each storage device in a cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system; carrying out transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model; acquiring a data characteristic value of data to be stored, and generating a characteristic data set corresponding to the data to be stored; inputting a characteristic data set corresponding to data to be stored into the neural network model for calculation, and determining the storage equipment according to an output result; and dynamically accessing the data to be stored into the storage equipment for cloud storage. The invention can select the optimal storage device for the new data and dynamically access the new data to the optimal storage device.

Description

Method and device for dynamically accessing storage equipment in cloud storage system
Technical Field
The invention relates to the technical field of computer communication, in particular to a method and a device for dynamically accessing a storage device in a cloud storage system.
Background
Cloud storage is a concept that extends and evolves from cloud computing. Cloud computing develops from distributed computing, parallel computing and grid computing: a large computing task is automatically split over a network into many smaller subtasks, which are computed and analyzed by a large system composed of multiple servers, and the processing result is then returned to the user.
Cloud storage, by analogy with cloud computing, refers to a system in which application software integrates a large number of storage devices of different types in a network so that they work cooperatively, through functions such as cluster applications, grid technology or a distributed file system, and jointly provide data storage and service access functions to the outside, thereby ensuring data security and saving storage space. In brief, cloud storage is an emerging solution that places storage resources on the cloud for users to access. A user can conveniently access data at any time and from any place by connecting to the cloud through any internet-connected device. A cloud storage system is composed of a plurality of storage devices that work cooperatively through functions such as clustering, a distributed file system or grid-like computing, and it provides certain types of storage and access services to users through application software or application interfaces.
In a traditional cloud storage system, the storage device to be accessed is basically selected through manual settings. For example, corresponding storage devices are set according to different data types and data volumes; as another example, when a user stores a photo taken with a mobile phone in a network-disk cloud storage service, the storage path must be selected manually. Because the user knows neither the characteristics of all storage devices in the cloud storage system in use nor the optimal choice when matching data characteristics to storage device characteristics, the storage device that the user manually selects for the data to be stored is often not the optimal one.
Disclosure of Invention
The invention provides a method and a device for dynamically accessing a storage device in a cloud storage system, which are used for solving the problem that an existing cloud storage system requires manual selection of the storage device and that this selection cannot be optimized. In the scheme for dynamically accessing the storage device in the cloud storage system, a neural network is trained by a machine learning algorithm on the data characteristics of the data stored in the storage devices of the cloud storage system and on the device characteristics; when new data needs to be stored, the neural network selects the optimal storage device for the new data, and the new data is dynamically accessed into that storage device.
The invention provides a method for dynamically accessing a storage device in a cloud storage system, which comprises the following steps:
acquiring a characteristic data set corresponding to each storage device in a cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system;
carrying out transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model;
acquiring a data characteristic value of data to be stored, and generating a characteristic data set corresponding to the data to be stored; the feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system;
inputting a characteristic data set corresponding to data to be stored into the neural network model for calculation, and determining the storage equipment according to an output result of the neural network model;
and dynamically accessing the data to be stored into the storage equipment for cloud storage.
In an optional embodiment, the performing transfer learning on the feature data set corresponding to each storage device by using the Resnet18 network structure to obtain a neural network model includes:
taking the feature data set $a_i$ corresponding to each storage device as input sample data, and calculating an output vector $y_i$ according to the following formula:
[Formula image BDA0002756716240000021]
calculating the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize a preset loss function $L(a_i, y_i)$;
substituting the parameters $W_1$, $W_2$, $c_1$, $c_2$ into the above formula to obtain a neural network model represented by the following formula:
[Formula image BDA0002756716240000031]
wherein $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the $i$-th storage device and the device feature values of all storage devices in the cloud storage system; $i = 1, 2, \ldots, n$; $n$ is the total number of storage devices in the cloud storage system; $m$ is the number of parameters of the feature data set corresponding to each storage device, and $n < m$; $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector; $W_1 \in \mathbb{R}^{(256 \times 256) \times m}$ means that $W_1$ is a $(256 \times 256) \times m$ matrix; $W_2 \in \mathbb{R}^{n \times 100}$ means that $W_2$ is an $n \times 100$ matrix; $c_1 \in \mathbb{R}^{(256 \times 256) \times 1}$ means that $c_1$ is a $(256 \times 256)$-dimensional column vector; $c_2 \in \mathbb{R}^{n}$ means that $c_2$ is an $n$-dimensional column vector; $x$ and $y$ are the input and output, respectively, of the neural network model;
the expression of the function $\sigma(\cdot)$ is:
[Formula image BDA0002756716240000032]
the expression of the $\mathrm{Sigmoid}(\cdot)$ function is:
[Formula image BDA0002756716240000033]
the expression of the preset loss function $L(a_i, y_i)$ is:
[Formula image BDA0002756716240000034]
$b_i$ is an $n$-dimensional row vector whose $i$-th element equals 1 and whose other elements equal 0.
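The shapes defined above can be illustrated with a short Python sketch. The formula images are not reproduced in this text, so the sketch makes two assumptions that are not confirmed by the source: `sigmoid` is taken to be the standard logistic function, and the output is taken to be `Sigmoid(W2 @ f2 + c2)`. Only the dimensions (a 100-dimensional $f_2$, an $n \times 100$ matrix $W_2$, an $n$-dimensional $c_2$ and a one-hot target $b_i$) come from the text; the function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Standard logistic sigmoid 1 / (1 + exp(-x)), applied element-wise (assumed form)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def one_hot_target(i, n):
    """Target vector b_i: 1 at position i (1-based), 0 at every other position."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    b = np.zeros(n)
    b[i - 1] = 1.0
    return b


def output_layer(f2, W2, c2):
    """Map the 100-dimensional intermediate output f2 to an n-dimensional vector y.

    ASSUMPTION: y = Sigmoid(W2 @ f2 + c2). Only the shapes are taken from the text.
    """
    f2 = np.asarray(f2, dtype=float).reshape(100)
    return sigmoid(W2 @ f2 + c2)


n = 3                               # number of storage devices (illustrative)
rng = np.random.default_rng(0)
W2 = rng.normal(size=(n, 100))      # n x 100 matrix
c2 = np.zeros(n)                    # n-dimensional vector
y = output_layer(rng.normal(size=100), W2, c2)
b2 = one_hot_target(2, n)           # training target for the 2nd storage device
```

The random values stand in for trained parameters; in the described method they would be the values that minimize the preset loss function.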
In an optional embodiment, the inputting the feature data set corresponding to the data to be stored into the neural network model for calculation, and determining the storage device according to the output result of the neural network model includes:
inputting a characteristic data set corresponding to data to be stored into the neural network model, and calculating to obtain an output vector y; wherein y is an n-dimensional column vector;
determining the row number N of the element with the largest value in the output vector y;
and determining the Nth storage device in the cloud storage system as the current storage device.
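The selection step above amounts to an arg-max over the output vector. A minimal Python sketch, assuming the storage devices are numbered from 1 in the same order as the rows of y (the function name is hypothetical):

```python
import numpy as np


def select_storage_device(y):
    """Return the 1-based row number N of the largest element of the output vector y."""
    y = np.asarray(y, dtype=float).ravel()
    if y.size == 0:
        raise ValueError("output vector y must not be empty")
    # np.argmax is 0-based and returns the first index when several elements tie.
    return int(np.argmax(y)) + 1


N = select_storage_device([0.1, 0.7, 0.2])  # the 2nd storage device is selected
```

The text does not say how ties are resolved; this sketch simply keeps the first maximal row.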
In an optional embodiment, before the inputting the feature data set corresponding to the data to be stored into the neural network model, the method further includes:
comparing the dimension s of the feature data set corresponding to the data to be stored with the dimension m of the input data required by the neural network model;
if s is equal to m, executing the step of inputting the characteristic data set corresponding to the data to be stored into the neural network model;
if $m - k \le s \le m + k$ and $s \ne m$, transforming the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to the formula $x = W_{ms} x'$, then taking the transformed feature data set x as the input of the neural network model and executing the step of calculating the output vector y; wherein x' is the feature data set corresponding to the data to be stored and is an s-dimensional column vector, x is an m-dimensional column vector, $W_{ms}$ is a preset $m \times s$ real matrix, and k is a preset integer;
when $s < m - k$, determining a preset small-data storage device as the current storage device;
and when $s > m + k$, determining a preset big-data storage device as the current storage device.
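The dimension check can be sketched in Python as follows. The device labels and the dictionary `W_by_s`, which maps each admissible dimension s to a preset m × s matrix, are hypothetical names introduced for illustration. The last branch is treated as s > m + k, which is the only range not covered by the other three conditions.

```python
import numpy as np

SMALL_DATA_DEVICE = "small-data-device"  # hypothetical label for the preset small-data device
BIG_DATA_DEVICE = "big-data-device"      # hypothetical label for the preset big-data device


def route_input(x_prime, m, k, W_by_s):
    """Decide how a feature data set of dimension s is handled.

    Returns ("model", x) when the (possibly transformed) m-dimensional vector x
    should be fed to the neural network model, or ("device", label) when a
    preset storage device is chosen directly.
    """
    x_prime = np.asarray(x_prime, dtype=float).ravel()
    s = x_prime.size
    if s == m:
        return ("model", x_prime)
    if m - k <= s <= m + k:
        W_ms = np.asarray(W_by_s[s], dtype=float)  # preset m x s real matrix
        return ("model", W_ms @ x_prime)           # x = W_ms x'
    if s < m - k:
        return ("device", SMALL_DATA_DEVICE)
    return ("device", BIG_DATA_DEVICE)             # remaining case: s > m + k


# Illustrative presets for m = 4, k = 1.
presets = {3: np.ones((4, 3)), 5: np.ones((4, 5))}
kind, x = route_input([1.0, 2.0, 3.0], m=4, k=1, W_by_s=presets)
```

How the matrices $W_{ms}$ are chosen is not described in the text, so the all-ones matrices here are placeholders only.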
In an optional embodiment, before obtaining the feature data set corresponding to each storage device in the cloud storage system, the method further includes: determining whether to adjust a storage rule of the storage device, the determining step comprising:
Step A1: based on a historical database of the cloud storage system, retrieving the device log of each storage device in the cloud storage system, fitting the device log, determining from the fitting result whether the corresponding storage device works normally, and calibrating the storage devices that work normally;
Step A2: based on the historical database of the cloud storage system, retrieving the storage log of each calibrated storage device, performing cluster analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculating a comprehensive storage value Z for the calibrated storage device based on the cluster set and the following formula:
[Formula image BDA0002756716240000041]
wherein Q denotes the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ denotes the sub-data attribute value of the q1-th class of sub-data; K denotes the storage attribute value of the corresponding calibrated storage device; $\delta_{q1}$ denotes the proportion of the data volume of the q1-th class of sub-data in the cluster set;
Step A3: judging whether the absolute difference between the comprehensive storage value Z and a preset storage value corresponding to the calibrated storage device is within a preset difference range;
if so, retaining the storage log corresponding to the calibrated storage device, and continuing to store related target data according to the current storage rule corresponding to the calibrated storage device;
otherwise, acquiring each class of sub-data in the cluster set corresponding to the calibrated storage device, and extracting the N1 largest data-volume proportions based on the data-volume proportion of each class of sub-data in the cluster set, wherein N1 is smaller than Q, and N1 and Q are both positive integers;
Step A4: calculating, according to the following formula, the matching value P1 between the sub-data attribute value corresponding to each of the N1 largest data-volume proportions and the storage attribute of the corresponding calibrated storage device:
[Formula image BDA0002756716240000051]
wherein n1 = 1, 2, 3, …, N1; $K_{n1}$ denotes the sub-data attribute value of the n1-th class of sub-data; $\delta_{n1}$ denotes the proportion of the data volume of the n1-th class of sub-data in the cluster set;
Step A5: acquiring the sub-data whose matching value is smaller than a preset value, storing that sub-data in a standby device, adjusting the current storage rule of the corresponding calibrated storage device, and continuing to store related target data according to the adjusted storage rule.
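The control flow of steps A3 to A5 can be sketched in Python. The formulas for Z and P1 are not reproduced in this text, so the sketch takes both as already-computed inputs. All function names are hypothetical, and reading "within a preset difference range" as an absolute difference no larger than a threshold is an assumption.

```python
def within_preset_range(Z, preset_value, max_diff):
    """Step A3 check. ASSUMPTION: 'within range' means |Z - preset| <= max_diff."""
    return abs(Z - preset_value) <= max_diff


def top_n1_classes(proportions, n1):
    """Return the labels of the n1 sub-data classes with the largest data-volume proportions.

    proportions maps a class label to its proportion (delta) in the cluster set;
    n1 must be a positive integer smaller than the number of classes Q.
    """
    if not 0 < n1 < len(proportions):
        raise ValueError("n1 must be a positive integer smaller than Q")
    ranked = sorted(proportions.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:n1]]


def subdata_to_relocate(match_values, preset_value):
    """Step A5: classes whose matching value P1 is below the preset value go to standby."""
    return [label for label, p1 in match_values.items() if p1 < preset_value]


proportions = {"images": 0.5, "logs": 0.2, "video": 0.3}   # illustrative cluster set
keep_rule = within_preset_range(Z=0.82, preset_value=0.80, max_diff=0.05)
largest = top_n1_classes(proportions, n1=2)
relocate = subdata_to_relocate({"images": 0.9, "video": 0.4}, preset_value=0.6)
```

In the described flow the top-N1 extraction and the matching-value check run only when the step A3 check fails; the sample values above exercise each helper independently.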
The invention also provides a device for dynamically accessing the storage equipment in the cloud storage system, which comprises the following components:
the first acquisition module is used for acquiring a characteristic data set corresponding to each storage device in the cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system;
the learning module is used for performing transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model;
the second acquisition module is used for acquiring the data characteristic value of the data to be stored and generating a characteristic data set corresponding to the data to be stored; the feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system;
the calculation module is used for inputting the feature data set corresponding to the data to be stored into the neural network model for calculation, and determining the storage equipment according to the output result of the neural network model;
and the storage module is used for dynamically accessing the data to be stored into the storage equipment for cloud storage.
In an optional embodiment, the learning module includes:
a first calculation unit, configured to take the feature data set $a_i$ corresponding to each storage device as input sample data and calculate an output vector $y_i$ according to the following formula:
[Formula image BDA0002756716240000061]
a second calculation unit, configured to calculate the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize a preset loss function $L(a_i, y_i)$;
a model establishing unit, configured to substitute the parameters $W_1$, $W_2$, $c_1$, $c_2$ calculated by the second calculation unit into the formula used by the first calculation unit, resulting in a neural network model represented by the following formula:
[Formula image BDA0002756716240000062]
wherein $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the $i$-th storage device and the device feature values of all storage devices in the cloud storage system; $i = 1, 2, \ldots, n$; $n$ is the total number of storage devices in the cloud storage system; $m$ is the number of parameters of the feature data set corresponding to each storage device, and $n < m$; $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector; $W_1 \in \mathbb{R}^{(256 \times 256) \times m}$ means that $W_1$ is a $(256 \times 256) \times m$ matrix; $W_2 \in \mathbb{R}^{n \times 100}$ means that $W_2$ is an $n \times 100$ matrix; $c_1 \in \mathbb{R}^{(256 \times 256) \times 1}$ means that $c_1$ is a $(256 \times 256)$-dimensional column vector; $c_2 \in \mathbb{R}^{n}$ means that $c_2$ is an $n$-dimensional column vector; $x$ and $y$ are the input and output, respectively, of the neural network model;
the expression of the function $\sigma(\cdot)$ is:
[Formula image BDA0002756716240000063]
the expression of the $\mathrm{Sigmoid}(\cdot)$ function is:
[Formula image BDA0002756716240000064]
the expression of the preset loss function $L(a_i, y_i)$ is:
[Formula image BDA0002756716240000065]
$b_i$ is an $n$-dimensional row vector whose $i$-th element equals 1 and whose other elements equal 0.
In an optional embodiment, the calculation module includes:
the third calculation unit is used for inputting the feature data set corresponding to the data to be stored into the neural network model and calculating to obtain an output vector y; wherein y is an n-dimensional column vector;
the first determining unit is used for determining the row number N of the element with the largest value in the output vector y;
and the second determining unit is used for determining the Nth storage device in the cloud storage system as the current storage device according to the N value determined by the first determining unit.
In an optional embodiment, the computing module further includes:
the comparison unit is used for comparing the dimension s of the feature data set corresponding to the data to be stored with the dimension m of the input data required by the neural network model;
a transformation unit, configured to, when the comparison unit determines that $m - k \le s \le m + k$ and $s \ne m$, transform the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to the formula $x = W_{ms} x'$, and then provide the transformed feature data set x to the third calculation unit as the input of the neural network model; wherein x' is the feature data set corresponding to the data to be stored and is an s-dimensional column vector, x is an m-dimensional column vector, $W_{ms}$ is a preset $m \times s$ real matrix, and k is a preset integer;
the third calculation unit is specifically configured to, when the comparison unit determines that $s = m$, input the feature data set corresponding to the data to be stored into the neural network model and calculate an output vector; or to input the transformed feature data set x provided by the transformation unit into the neural network model and calculate an output vector;
the second determining unit is specifically configured to determine a preset small-data storage device as the current storage device when the comparison unit determines that $s < m - k$; or to determine a preset big-data storage device as the current storage device when the comparison unit determines that $s > m + k$; or to determine the N-th storage device in the cloud storage system as the current storage device according to the N value determined by the first determining unit.
In an optional embodiment, the apparatus further comprises: the storage rule adjusting module is used for determining whether to adjust the storage rule of the storage device; the storage rule adjusting module may include:
the calibration unit is used for calling the equipment log of each storage device in the cloud storage system based on a historical database of the cloud storage system, fitting the equipment log, determining whether the corresponding storage device works normally according to the fitting result, and calibrating the normally working storage device;
the clustering unit is used for retrieving the storage log of each calibrated storage device based on a historical database of the cloud storage system, performing cluster analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculating a comprehensive storage value Z for the calibrated storage device based on the cluster set and the following formula:
[Formula image BDA0002756716240000081]
wherein Q denotes the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ denotes the sub-data attribute value of the q1-th class of sub-data; K denotes the storage attribute value of the corresponding calibrated storage device; $\delta_{q1}$ denotes the proportion of the data volume of the q1-th class of sub-data in the cluster set;
the judging unit is used for judging whether an absolute value difference value between the comprehensive storage value Z and a preset storage value corresponding to the storage equipment after calibration processing is within a preset difference value range or not;
the storage unit is used for reserving the storage log corresponding to the storage device after the calibration processing when the judgment result of the judgment unit is yes, and meanwhile, continuously storing the related target data according to the current storage rule corresponding to the storage device after the calibration processing;
an obtaining unit, configured to obtain each class of sub-data in the cluster set corresponding to the calibrated storage device if the determination result of the judging unit is negative, and to extract the N1 largest data-volume proportions based on the data-volume proportion of each class of sub-data in the cluster set, where N1 is smaller than Q, and N1 and Q are both positive integers;
a matching value calculating unit, configured to calculate, according to the following formula, the matching value P1 between the sub-data attribute value corresponding to each of the N1 largest data-volume proportions and the storage attribute of the corresponding calibrated storage device:
[Formula image BDA0002756716240000082]
wherein n1 = 1, 2, 3, …, N1; $K_{n1}$ denotes the sub-data attribute value of the n1-th class of sub-data; $\delta_{n1}$ denotes the proportion of the data volume of the n1-th class of sub-data in the cluster set;
and the rule adjusting unit is used for acquiring the subdata with the matching value smaller than the preset value, storing the corresponding subdata into the standby equipment, adjusting the current storage rule of the corresponding storage equipment after calibration processing, and continuously storing the related target data according to the adjusted storage rule.
According to the method and the device for dynamically accessing the storage device in the cloud storage system, transfer learning is performed with a Resnet18 network structure on feature data sets built from the data characteristics of the stored data and the device characteristics of the storage devices in the cloud storage system to obtain a neural network model; the data feature values of new data to be stored are then acquired, and the optimal storage device is obtained by calculation with the neural network model. The storage device therefore need not be selected manually: the optimal storage device is selected according to the characteristics of the data to be stored, and the data is dynamically accessed into it for storage.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
Fig. 1 is a flowchart of a method for dynamically accessing a storage device in a cloud storage system according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for dynamically accessing a storage device in a cloud storage system according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a first embodiment of a method for dynamically accessing a storage device in a cloud storage system according to the present invention;
Fig. 4 is a schematic structural diagram of a second embodiment of a method for dynamically accessing a storage device in a cloud storage system according to the present invention;
Fig. 5 is a schematic structural diagram of a third embodiment of a method for dynamically accessing a storage device in a cloud storage system according to the present invention;
Fig. 6 is a schematic structural diagram of a fourth embodiment of a method for dynamically accessing a storage device in a cloud storage system according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 1 is a flowchart of a method for dynamically accessing a storage device in a cloud storage system according to an embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
S11: Acquire the feature data set corresponding to each storage device in the cloud storage system.
The feature data set comprises data feature values of stored data of corresponding storage devices and device feature values of all storage devices in the cloud storage system. The data characteristics of the data can include the type of the data, the size of the data, the data format and other characteristics; device characteristics of a storage device may include total storage capacity of the device, remaining storage capacity of the device, ambient temperature of operation of the device, and the like.
For example, suppose the cloud storage system currently has 2 storage devices, $Y_1$ and $Y_2$. In this step, the following values are acquired. For storage device $Y_1$, the data feature values of the stored data are: data type d1 and data size e1; the total storage capacity of the device is g1, its remaining storage capacity is g1', and its working environment temperature is T1. For storage device $Y_2$, the data feature values of the stored data are: data type d2 and data format h2; the total storage capacity of the device is g2, its remaining storage capacity is g2', and its working environment temperature is T2. If the feature data set $a_i$ corresponding to a storage device is defined as (data type, data size, data format, storage capacity of $Y_1$, remaining storage capacity of $Y_1$, working environment temperature of $Y_1$, storage capacity of $Y_2$, remaining storage capacity of $Y_2$, working environment temperature of $Y_2$), then the feature data set corresponding to storage device $Y_1$ is $a_1$ = (d1, e1, 0, g1, g1', T1, g2, g2', T2), and the feature data set corresponding to storage device $Y_2$ is $a_2$ = (d2, 0, h2, g1, g1', T1, g2, g2', T2).
Preferably, in the feature data set corresponding to the storage device, the number of the data features is greater than the number of the device features of all the storage devices in the cloud storage system, so as to increase the weight of the data features in machine learning.
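The construction of a feature data set in the example above can be sketched as follows. This is a minimal illustration, not the filing's implementation: the names `DeviceFeatures` and `build_feature_set` and all numeric values are hypothetical, and it assumes that categorical features such as data type and data format have already been encoded as numbers. A data feature that is absent is filled with 0, as in the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceFeatures:
    """Device feature values of one storage device (hypothetical container)."""
    capacity: float            # total storage capacity, e.g. g1
    remaining_capacity: float  # remaining storage capacity, e.g. g1'
    temperature: float         # working environment temperature, e.g. T1


def build_feature_set(data_type: float,
                      data_size: Optional[float],
                      data_format: Optional[float],
                      devices: List[DeviceFeatures]) -> List[float]:
    """Concatenate the data features with the device features of ALL devices.

    Layout: (data type, data size, data format, then three values per device).
    A missing data feature is encoded as 0.
    """
    vec = [
        data_type,
        0.0 if data_size is None else data_size,
        0.0 if data_format is None else data_format,
    ]
    for dev in devices:
        vec.extend([dev.capacity, dev.remaining_capacity, dev.temperature])
    return vec


# Placeholder numbers standing in for g1, g1', T1 and g2, g2', T2.
devices = [DeviceFeatures(1000.0, 400.0, 25.0), DeviceFeatures(2000.0, 1500.0, 30.0)]
a1 = build_feature_set(1.0, 64.0, None, devices)  # Y1: type d1, size e1, no format
a2 = build_feature_set(2.0, None, 3.0, devices)   # Y2: type d2, no size, format h2
```

The same helper could be reused for data that has not been stored yet, since its feature data set has the same layout.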
S12: Perform transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model.
In this embodiment, the Resnet18 network structure is introduced in order to reduce the amount of computation during neural network training. Resnet18 can be regarded as a nonlinear mapping: for a given input variable $z \in R^{256\times256,1}$, the output variable Resnet18(z) is a 100-dimensional column vector.
In an alternative embodiment, the step S12 can be specifically realized by the following steps S121 to S123:
S121: Taking the feature data set $a_i$ corresponding to each storage device as input sample data, calculate the output vector $y_i$ according to the following equation (1):
Figure BDA0002756716240000111
where $a_i = (a_{i1}, a_{i2}, \dots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the i-th storage device and the device feature values of all storage devices in the cloud storage system, $i = 1, 2, \dots, n$; n is the total number of storage devices in the cloud storage system; m is the number of parameters in the feature data set corresponding to each storage device; and $n < m$. $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector. $W_1 \in R^{256\times256,\,m}$ means that $W_1$ is a (256 × 256) × m matrix; $W_2 \in R^{n,100}$ means that $W_2$ is an n × 100 matrix; $c_1 \in R^{256\times256,1}$ means that $c_1$ is a (256 × 256)-dimensional column vector; and $c_2 \in R^{n}$ means that $c_2$ is an n-dimensional column vector;
the expression of the function σ () is:
Figure BDA0002756716240000112
the expression of the Sigmoid () function is:
Figure BDA0002756716240000113
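The equation images referenced above are not reproduced in this text. As a hedged reading only, the dimensions stated for $W_1$, $c_1$, $W_2$, $c_2$ and for the Resnet18 mapping are consistent with a three-stage composition of the following form. This is an assumption inferred from those dimensions, the exact equation (1) of the filing may differ, and the expression of σ cannot be recovered from the text. The Sigmoid line is the standard logistic function applied element-wise.

```latex
\begin{aligned}
f_1 &= \sigma\!\left(W_1 a_i + c_1\right), & f_1 &\in R^{256\times256,\,1},\\
f_2 &= \mathrm{Resnet18}(f_1),             & f_2 &\in R^{100},\\
y_i &= \mathrm{Sigmoid}\!\left(W_2 f_2 + c_2\right), & y_i &\in R^{n},\\
\mathrm{Sigmoid}(t) &= \frac{1}{1 + e^{-t}}.
\end{aligned}
```

Under this reading, the m-dimensional input is first lifted to the 256 × 256 input size expected by Resnet18, and the 100-dimensional Resnet18 output is then mapped to one score per storage device.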
S122: Calculate the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize the preset loss function $L(a_i, y_i)$.
Preferably, the expression of the preset loss function $L(a_i, y_i)$ is:
Figure BDA0002756716240000114
where $b_i$ is an n-dimensional row vector whose i-th element equals 1 and whose other elements equal 0. Suppose the storage devices in the cloud storage system are $Y_1, Y_2, \dots, Y_n$; the label $b_i \in \{Y_1, Y_2, \dots, Y_n\}$ identifies the i-th storage device $Y_i$. To calculate the loss function, $b_i$ is converted into a one-hot code; for example, the one-hot code corresponding to the i-th storage device is $b_i = (0, 0, \dots, 0, 1, 0, \dots, 0, 0)$, i.e., the i-th position is 1 and the remaining positions are 0. The multiplication in equation (2) is a Hadamard (element-wise) multiplication and can also be understood as an inner product between vectors.
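The one-hot label and the reading of the Hadamard product as an inner product can be illustrated with a short sketch. The loss expression itself is not reproduced in this text, so the cross-entropy form used in `cross_entropy` below is an assumption chosen for illustration and not the filing's confirmed loss; the function names and the sample vector are hypothetical.

```python
import math
from typing import List, Sequence


def one_hot(i: int, n: int) -> List[float]:
    """One-hot code b_i for the i-th of n storage devices (i is 1-based)."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    return [1.0 if j == i - 1 else 0.0 for j in range(n)]


def hadamard_sum(b: Sequence[float], v: Sequence[float]) -> float:
    """Sum of the element-wise (Hadamard) product, i.e. the inner product of b and v."""
    return sum(bj * vj for bj, vj in zip(b, v))


def cross_entropy(b: Sequence[float], y: Sequence[float], eps: float = 1e-12) -> float:
    """ASSUMED loss form: -sum_j b_j * log(y_j). Not confirmed by the source text."""
    return -hadamard_sum(b, [math.log(max(yj, eps)) for yj in y])


b2 = one_hot(2, 3)             # label for the 2nd of 3 devices
y = [0.2, 0.7, 0.1]            # hypothetical network output
loss = cross_entropy(b2, y)    # only the 2nd element of y contributes
```

Because the label is one-hot, the inner product picks out a single element of the output, so under the assumed form the loss depends only on the score given to the correct device.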
S123: Substitute the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ calculated in the previous step into formula (1) to obtain the neural network model represented by formula (3):
Figure BDA0002756716240000121
In formula (3), x and y are the input and output of the neural network model, respectively; the input x is an m-dimensional column vector and the output y is an n-dimensional column vector.
S13: Acquire the data feature values of the data to be stored, and generate the feature data set corresponding to the data to be stored.
The feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system.
The specific implementation of this step is similar to step S11 above: the data feature values of the current data to be stored are obtained and combined with the device feature values of all storage devices in the cloud storage system obtained in S11, which yields the feature data set corresponding to the data to be stored.
S14: Input the feature data set corresponding to the data to be stored into the trained neural network model for calculation, and determine the current storage device according to the output of the neural network model.
In an alternative embodiment, step S14 may include the following steps S141-S143:
S141: Input the feature data set x corresponding to the data to be stored into the neural network model shown in formula (3), and calculate the output vector y, where x is an m-dimensional column vector and y is an n-dimensional column vector.
S142: Determine the row number N of the element with the largest value in the output vector y.
In this embodiment, it follows from the formula of the Sigmoid function that the value of each element of the output vector y obtained in the previous step lies between 0 and 1.
In this step, a formula is applied to identify the row number of the largest element in y.
S143: Determine the N-th storage device in the cloud storage system as the current storage device.
In this embodiment, when the neural network model of formula (3) is evaluated, $f_2$ is a 100-dimensional column vector and the value of each element of the output y lies between 0 and 1. In effect, the neural network computes, for each existing storage device, the probability that the input data belongs to that device. Selecting the element of y with the largest value therefore selects the storage device that is most likely to be the one for storing the data to be stored. In this way, the neural network automatically and intelligently selects the optimal storage device for the current data to be stored, based on the features of that data and the features of the storage devices in the cloud storage system.
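Steps S142 and S143 amount to taking the position of the largest element of y. A minimal sketch follows, assuming the output is available as a plain Python list; the function names and the sample scores are hypothetical, and ties are resolved in favor of the lowest row number, which the source does not specify.

```python
import math
from typing import Sequence


def sigmoid(t: float) -> float:
    """Logistic function, written to avoid overflow for large negative t."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def select_device(y: Sequence[float]) -> int:
    """Return the 1-based row number N of the largest element of y."""
    if not y:
        raise ValueError("output vector y must not be empty")
    best_index = max(range(len(y)), key=lambda j: y[j])
    return best_index + 1


# Hypothetical pre-activation scores for n = 3 storage devices.
scores = [-1.2, 0.7, 0.1]
y = [sigmoid(t) for t in scores]  # every element lies strictly between 0 and 1
N = select_device(y)              # the N-th device is chosen for this write
```

Since the Sigmoid function is monotonic, the same device would be chosen from the raw scores; the Sigmoid only rescales them into the (0, 1) range.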
S15: Dynamically access the data to be stored to the current storage device for cloud storage.
In this embodiment, after the current storage device is determined, the data to be stored is dynamically accessed to that storage device for cloud storage, and no manual selection of the storage device is required.
In this embodiment, feature data sets are built from the data features of the data stored on the storage devices in the cloud storage system and from the device features of those devices, and transfer learning is performed on the feature data sets by using a Resnet18 network structure to obtain a neural network model. The data feature values of new data to be stored are then obtained, and the optimal storage device is computed by the neural network. No manual selection is needed: the optimal storage device is selected according to the features of the data to be stored, and the data is dynamically accessed to it for storage.
If the dimension of the feature data set corresponding to the data to be stored is not equal to the dimension of the input data required by the neural network model, the above method needs to be further improved, and the following detailed description is provided in the second embodiment.
Fig. 2 is a flowchart of a method for dynamically accessing a storage device in a cloud storage system according to a second embodiment of the present invention, and as shown in fig. 2, the method includes the following steps:
S201: Acquire the feature data set corresponding to each storage device in the cloud storage system.
In this embodiment, the specific implementation of step S201 is similar to step S11 described above and is not repeated here.
S202: Perform transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model.
In this embodiment, the specific implementation of step S202 is similar to step S12 and is not repeated here.
S203: Acquire the data feature values of the data to be stored, and generate the feature data set corresponding to the data to be stored.
In this embodiment, the specific implementation of step S203 is similar to step S13 and is not repeated here.
S204: Compare the dimension s of the feature data set corresponding to the data to be stored with the dimension m of the input data required by the neural network model.
In this embodiment, if $s = m$, step S205 is executed; if $m - k \le s \le m + k$ and $s \ne m$, step S206 is executed; if $s < m - k$, step S210 is executed; if $s > m + k$, step S211 is executed.
S205: Input the feature data set corresponding to the data to be stored into the neural network model, and then execute step S207.
In this step, the feature data set corresponding to the current data to be stored is used as the input of the neural network model shown in formula (3).
S206: Transform the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to formula (4), then input the transformed feature data set x into the neural network model, and execute step S207:
$x = W_{ms} x'$  (4)
where x' is the feature data set corresponding to the original data to be stored and is an s-dimensional column vector; x is the feature data set obtained by transforming x' and is an m-dimensional column vector; $W_{ms}$ is a preset m × s real matrix; and k is a preset integer.
This step applies when the dimension s of the feature data set corresponding to the data to be stored is close to the dimension m: a hidden layer is added to the trained neural network structure to transform the dimension of the network input, so that the input dimension required by formula (3) is met.
S207: Calculate the output vector through the neural network.
In this step, the input vector is passed through the neural network model shown in formula (3) to obtain the output vector.
S208: Determine the row number N of the element with the largest value in the output vector y.
In this embodiment, the specific implementation of step S208 is similar to step S142 and is not repeated here.
S209: Determine the N-th storage device in the cloud storage system as the current storage device, and then execute step S212.
S210: Determine the preset small data storage device as the current storage device, and then execute step S212.
In this step, the preset small data storage device may be a storage device that is specified in advance from n storage devices in the cloud storage system, or alternatively, in the cloud storage system, in addition to the n storage devices, a new storage device is added as the small data storage device and is only used to store data with a small data size.
S211: Determine the preset big data storage device as the current storage device, and then execute step S212.
In this step, the preset big data storage device may be a storage device that is specified in advance from n storage devices in the cloud storage system, or a storage device with a larger capacity is newly added as the big data storage device in the cloud storage system in addition to the n storage devices, and is dedicated to store data with a larger data volume.
S212: Dynamically access the data to be stored to the current storage device for cloud storage.
On the basis of the first embodiment, this embodiment provides a more complete method for dynamically accessing a storage device, covering the case in which the dimension of the feature data set corresponding to the data to be stored is not equal to the dimension of the input data required by the trained neural network model.
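The branching of step S204 and the dimension transformation of formula (4) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and branch labels are hypothetical, the last branch is taken to be s > m + k so that the four cases are disjoint and exhaustive, and the matrix $W_{ms}$ is represented as a list of m rows of length s.

```python
from typing import List, Sequence


def route(s: int, m: int, k: int) -> str:
    """Decide which branch of step S204 applies for input dimension s."""
    if s == m:
        return "direct"             # S205: feed x straight into the model
    if m - k <= s <= m + k:
        return "transform"          # S206: apply x = W_ms x' first
    if s < m - k:
        return "small_data_device"  # S210: preset small data storage device
    return "big_data_device"        # S211: remaining case, s > m + k


def transform(w_ms: Sequence[Sequence[float]], x_prime: Sequence[float]) -> List[float]:
    """Formula (4): map the s-dimensional x' to the m-dimensional x = W_ms x'."""
    s = len(x_prime)
    if any(len(row) != s for row in w_ms):
        raise ValueError("every row of W_ms must have length s")
    return [sum(w * v for w, v in zip(row, x_prime)) for row in w_ms]


# Hypothetical example with m = 3, s = 2, k = 1.
w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
x = transform(w, [2.0, 3.0])  # a 3-dimensional vector the model can accept
branch = route(2, 3, 1)
```

Only the first two branches reach the neural network; the other two bypass it and fall back to a preset device.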
In an optional embodiment, before obtaining the feature data set corresponding to each storage device in the cloud storage system, the method further includes: determining whether to adjust a storage rule of the storage device, the determining step comprising:
Step A1: Call the device log of each storage device in the cloud storage system based on a historical database of the cloud storage system, determine from the device log whether the corresponding storage device works normally, and calibrate the normally working storage devices; here, calibration means marking;
Step A2: Call the storage log of each calibrated storage device based on the historical database of the cloud storage system, perform cluster analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculate the comprehensive storage value Z corresponding to the calibrated storage device based on the cluster set and the following formula:
Figure BDA0002756716240000161
where Q represents the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ represents the sub-data attribute value of the q1-th class of sub-data, the sub-data attribute value being the data volume of the sub-data; K represents the storage attribute value of the corresponding calibrated storage device, the storage attribute value being the size of its storage space; and $\delta_{q1}$ represents the proportion of the data volume of the q1-th class of sub-data in the cluster set;
Step A3: Judge whether the absolute difference between the comprehensive storage value Z and the preset storage value corresponding to the calibrated storage device is within a preset difference range;
if so, retain the storage log corresponding to the calibrated storage device and continue to store the related target data according to the current storage rule of that device;
otherwise, acquire each class of sub-data in the cluster set corresponding to the calibrated storage device and, based on the data-volume proportion of each class of sub-data in the cluster set, extract the N1 largest data-volume proportions, where N1 is smaller than Q, and N1 and Q are both positive integers;
Step A4: Calculate, according to the following formula, the matching value P1 between the sub-data attribute value corresponding to each of the N1 largest data-volume proportions and the storage attribute of the corresponding calibrated storage device:
Figure BDA0002756716240000162
where $n1 = 1, 2, 3, \dots, N1$; $K_{n1}$ represents the sub-data attribute value of the n1-th class of sub-data; and $\delta_{n1}$ represents the proportion of the data volume of the n1-th class of sub-data in the cluster set;
Step A5: Acquire the sub-data whose matching value is smaller than a preset value, store that sub-data in a standby device, adjust the current storage rule of the corresponding calibrated storage device, and continue to store the related target data according to the adjusted storage rule.
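The control flow of steps A3 to A5 can be sketched as follows. The formulas for the comprehensive storage value Z and the matching value P1 are not reproduced in this text, so the sketch does not compute them: Z is passed in as a number and P1 is supplied by the caller as a function. All names, class labels, and numbers are hypothetical, and the preset difference range is assumed to be an upper bound on the absolute difference.

```python
from typing import Callable, Dict, List


def within_range(z: float, preset: float, max_diff: float) -> bool:
    """Step A3 check, ASSUMING the preset range means |Z - preset| <= max_diff."""
    return abs(z - preset) <= max_diff


def top_classes(proportions: Dict[str, float], n1: int) -> List[str]:
    """Sub-data classes with the N1 largest data-volume proportions (N1 < Q)."""
    if not 0 < n1 < len(proportions):
        raise ValueError("N1 must be a positive integer smaller than Q")
    return sorted(proportions, key=lambda c: proportions[c], reverse=True)[:n1]


def classes_to_relocate(proportions: Dict[str, float],
                        n1: int,
                        match_value: Callable[[str], float],
                        threshold: float) -> List[str]:
    """Steps A4 and A5: among the top N1 classes, keep those whose matching
    value (computed by the caller-supplied function) is below the threshold;
    these are the classes to move to the standby device."""
    return [c for c in top_classes(proportions, n1) if match_value(c) < threshold]


# Hypothetical cluster set with Q = 4 classes and their data-volume proportions.
proportions = {"logs": 0.5, "images": 0.3, "video": 0.15, "other": 0.05}
matches = {"logs": 0.9, "images": 0.2}  # stand-in for P1 values

if within_range(0.95, 1.0, 0.1):
    relocate: List[str] = []            # keep the current storage rule
else:
    relocate = classes_to_relocate(proportions, 2, matches.__getitem__, 0.5)

# What the "otherwise" branch would select for this cluster set:
candidates = classes_to_relocate(proportions, 2, matches.__getitem__, 0.5)
```

Keeping Z and P1 outside the sketch avoids guessing at formulas that the text does not show, while still making the decision structure explicit.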
Corresponding to the method provided in the embodiments of the present invention, an embodiment of the present invention further provides an apparatus for dynamically accessing a storage device in a cloud storage system. Fig. 3 is a schematic structural diagram of a first apparatus for dynamically accessing a storage device in a cloud storage system provided in the embodiment of the present invention. As shown in fig. 3, the apparatus includes:
the first obtaining module 11 is configured to obtain a feature data set corresponding to each storage device in the cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system;
the learning module 12 is configured to perform transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model;
the second obtaining module 13 is configured to obtain a data characteristic value of the data to be stored, and generate a characteristic data set corresponding to the data to be stored; the feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system;
the calculation module 14 is configured to input the feature data set corresponding to the data to be stored into the neural network model for calculation, and determine the storage device according to an output result of the neural network model;
and the storage module 15 is configured to dynamically access the data to be stored to the storage device of this time for cloud storage.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 4 is a schematic structural diagram of a second apparatus for dynamically accessing a storage device in a cloud storage system according to an embodiment of the present invention. As shown in fig. 4, the apparatus of this embodiment is based on the apparatus structure shown in fig. 3; further, the learning module 12 may include:
a first calculating unit 121, configured to take the feature data set $a_i$ corresponding to each storage device as input sample data and calculate the output vector $y_i$ according to the following formula:
Figure BDA0002756716240000171
a second calculating unit 122, configured to calculate the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize the preset loss function $L(a_i, y_i)$;
a model establishing unit 123, configured to substitute the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ calculated by the second calculating unit 122 into the formula used by the first calculating unit 121 to obtain the neural network model represented by the following formula:
Figure BDA0002756716240000181
where $a_i = (a_{i1}, a_{i2}, \dots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the i-th storage device and the device feature values of all storage devices in the cloud storage system, $i = 1, 2, \dots, n$; n is the total number of storage devices in the cloud storage system; m is the number of parameters in the feature data set corresponding to each storage device; and $n < m$. $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector. $W_1 \in R^{256\times256,\,m}$ means that $W_1$ is a (256 × 256) × m matrix; $W_2 \in R^{n,100}$ means that $W_2$ is an n × 100 matrix; $c_1 \in R^{256\times256,1}$ means that $c_1$ is a (256 × 256)-dimensional column vector; $c_2 \in R^{n}$ means that $c_2$ is an n-dimensional column vector; and x and y are the input and output, respectively, of the neural network model;
the expression of the function σ () is:
Figure BDA0002756716240000182
the expression of the Sigmoid () function is:
Figure BDA0002756716240000183
the expression of the preset loss function $L(a_i, y_i)$ is:
Figure BDA0002756716240000184
where $b_i$ is an n-dimensional row vector whose i-th element equals 1 and whose other elements equal 0.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 5 is a schematic structural diagram of a third embodiment of an apparatus for dynamically accessing a storage device in a cloud storage system, as shown in fig. 5, the apparatus of this embodiment is based on the apparatus structure shown in fig. 4, and further, the computing module 14 may include:
the third calculating unit 141 is configured to input a feature data set corresponding to data to be stored into the neural network model, and calculate to obtain an output vector y; wherein y is an n-dimensional column vector;
a first determining unit 142, configured to determine the row number N of the element with the largest value in the output vector y;
a second determining unit 143, configured to determine, according to the value of N determined by the first determining unit 142, the N-th storage device in the cloud storage system as the current storage device.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 6 is a schematic structural diagram of a fourth embodiment of an apparatus for dynamically accessing a storage device in a cloud storage system, as shown in fig. 6, the apparatus of this embodiment is based on the apparatus structure shown in fig. 5, and further, the computing module 14 further includes:
a comparing unit 144, configured to compare a dimension s of the feature data set corresponding to the data to be stored with a dimension m of the input data required by the neural network model;
a transformation unit 145, configured to, when the comparing unit 144 finds that $m - k \le s \le m + k$ and $s \ne m$, transform the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to the formula $x = W_{ms} x'$, and then provide the transformed feature data set x to the third calculating unit 141 as the input of the neural network model; where x' is the feature data set corresponding to the data to be stored and is an s-dimensional column vector, x is an m-dimensional column vector, $W_{ms}$ is a preset m × s real matrix, and k is a preset integer;
the third calculating unit 141 is specifically configured to, when the comparing unit 144 finds that $s = m$, input the feature data set corresponding to the data to be stored into the neural network model and calculate the output vector; or to input the transformed feature data set x provided by the transformation unit 145 into the neural network model and calculate the output vector;
the second determining unit 143 is specifically configured to determine the preset small data storage device as the current storage device when the comparing unit 144 finds that $s < m - k$; or to determine the preset big data storage device as the current storage device when the comparing unit 144 finds that $s > m + k$; or to determine the N-th storage device in the cloud storage system as the current storage device according to the value of N determined by the first determining unit 142.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 2, and the implementation principle and the technical effect are similar, which are not described herein again.
In an optional embodiment, the apparatus further comprises: the storage rule adjusting module is used for determining whether to adjust the storage rule of the storage device; the storage rule adjusting module may include:
the calibration unit is used for calling the equipment log of each storage device in the cloud storage system based on a historical database of the cloud storage system, fitting the equipment log, determining whether the corresponding storage device works normally according to the fitting result, and calibrating the normally working storage device;
the clustering unit is used for calling the storage log of each calibrated storage device based on a historical database of a cloud storage system, performing clustering analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculating a comprehensive storage value Z corresponding to the calibrated storage device based on the cluster set and the following formula;
Figure BDA0002756716240000201
where Q represents the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ represents the sub-data attribute value of the q1-th class of sub-data; K represents the storage attribute value of the corresponding calibrated storage device; and $\delta_{q1}$ represents the proportion of the data volume of the q1-th class of sub-data in the cluster set;
the judging unit is used for judging whether an absolute value difference value between the comprehensive storage value Z and a preset storage value corresponding to the storage equipment after calibration processing is within a preset difference value range or not;
the storage unit is used for reserving the storage log corresponding to the storage device after the calibration processing when the judgment result of the judgment unit is yes, and meanwhile, continuously storing the related target data according to the current storage rule corresponding to the storage device after the calibration processing;
an obtaining unit, configured to obtain each type of sub data in the cluster set corresponding to the storage device after the calibration processing if the determination result of the determining unit is negative, and extract top N1 maximum data volume fraction ratios based on the data volume fraction ratios of each type of sub data in the cluster set, where N1 is smaller than Q, and N1 and Q are both positive integers;
a matching value calculating unit, configured to calculate, according to the following formula, matching values P1 between the sub-data attribute values corresponding to the N1 maximum data size ratio proportions and the storage attribute of the corresponding calibrated storage device, respectively;
Figure BDA0002756716240000202
where $n1 = 1, 2, 3, \dots, N1$; $K_{n1}$ represents the sub-data attribute value of the n1-th class of sub-data; and $\delta_{n1}$ represents the proportion of the data volume of the n1-th class of sub-data in the cluster set;
and the rule adjusting unit is used for acquiring the subdata with the matching value smaller than the preset value, storing the corresponding subdata into the standby equipment, adjusting the current storage rule of the corresponding storage equipment after calibration processing, and continuously storing the related target data according to the adjusted storage rule.
The beneficial effects of the above technical solution are as follows. By determining whether to adjust the storage rule of a storage device, the storage effectiveness of the storage device can be adjusted in time, which provides an effective data identification basis for the subsequent dynamic access to storage devices. Normally working storage devices are selected by fitting the device logs, which ensures storage reliability. Whether a storage device continues to store data under its current storage rule is determined by calculating its comprehensive storage value and comparing the difference with the preset value, which improves the effective usability of the storage device.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for dynamically accessing a storage device in a cloud storage system is characterized by comprising the following steps:
acquiring a characteristic data set corresponding to each storage device in a cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system;
carrying out transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model;
acquiring a data characteristic value of data to be stored, and generating a characteristic data set corresponding to the data to be stored; the feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system;
inputting a characteristic data set corresponding to data to be stored into the neural network model for calculation, and determining the storage equipment according to an output result of the neural network model;
dynamically accessing the data to be stored into the storage equipment for cloud storage;
before obtaining a feature data set corresponding to each storage device in the cloud storage system, the method includes: determining whether to adjust a storage rule of the storage device, the determining step comprising:
step A1: based on a historical database of a cloud storage system, calling an equipment log of each storage device in the cloud storage system, fitting the equipment log, determining whether the corresponding storage device works normally according to a fitting processing result, and calibrating the normally-working storage device;
step A2: calling a storage log of each calibrated storage device based on a historical database of a cloud storage system, performing cluster analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculating a comprehensive storage value Z corresponding to the calibrated storage device based on the cluster set and the following formula;
Figure FDA0003051291200000011
wherein Q represents the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ represents the sub-data attribute value of the q1-th class of sub-data; K represents the storage attribute value of the corresponding calibrated storage device; $\delta_{q1}$ represents the proportion of the data volume of the q1-th class of sub-data in the cluster set;
step A3: judging whether the absolute value difference value between the comprehensive storage value Z and a preset storage value corresponding to the calibrated storage device is within a preset difference value range or not;
if so, reserving the storage log corresponding to the storage equipment after the calibration processing, and meanwhile, continuously storing related target data according to the current storage rule corresponding to the storage equipment after the calibration processing;
otherwise, acquiring each class of sub-data in the cluster set corresponding to the calibrated storage device, and extracting the N1 largest data-volume proportions based on the data-volume proportion of each class of sub-data in the cluster set, wherein N1 is smaller than Q, and N1 and Q are both positive integers;
step A4: calculating, according to the following formula, the matching value P1 between the sub-data attribute value corresponding to each of the N1 largest data-volume proportions and the storage attribute of the corresponding calibrated storage device;
Figure FDA0003051291200000021
wherein $n1 = 1, 2, 3, \ldots, N1$; $K_{n1}$ represents the sub-data attribute value of the n1-th class of sub-data; $\delta_{n1}$ represents the proportion of the data volume of the n1-th class of sub-data in the cluster set;
step A5: and acquiring subdata with a matching value smaller than a preset value, storing the corresponding subdata into standby equipment, adjusting the current storage rule of the corresponding storage equipment subjected to calibration processing, and continuously storing related target data according to the adjusted storage rule.
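The decision flow of steps A3 to A5 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the formulas for Z and P1 appear only as figures and are not reproduced here, so the sketch takes Z and the P1 values as already-computed inputs. The function names, the class labels, and the reading of "within a preset difference range" as `abs(Z - preset) <= max_diff` are all assumptions.

```python
from typing import Dict, List, Tuple


def within_tolerance(z: float, preset_value: float, max_diff: float) -> bool:
    """Step A3: check whether |Z - preset value| lies inside the preset range.

    Assumption: the "preset difference range" is modelled as a single upper bound.
    """
    return abs(z - preset_value) <= max_diff


def top_fractions(fractions: Dict[str, float], n1: int) -> List[Tuple[str, float]]:
    """Step A3 (else branch): pick the N1 classes with the largest data-volume share."""
    if not 0 < n1 < len(fractions):
        raise ValueError("N1 must be a positive integer smaller than Q")
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n1]


def classes_to_relocate(match_values: Dict[str, float], threshold: float) -> List[str]:
    """Step A5: classes whose matching value P1 is below the preset value."""
    return [name for name, p1 in match_values.items() if p1 < threshold]


if __name__ == "__main__":
    # Hypothetical cluster shares for one calibrated storage device.
    shares = {"video": 0.5, "logs": 0.3, "images": 0.15, "other": 0.05}
    if not within_tolerance(z=1.4, preset_value=0.8, max_diff=0.1):
        print(top_fractions(shares, n1=2))
        # P1 values would come from the formula in step A4 (not reproduced here).
        print(classes_to_relocate({"video": 0.9, "logs": 0.4}, threshold=0.6))
```

Sub-data classes returned by `classes_to_relocate` are the ones that step A5 would move to the standby device before the storage rule is adjusted.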
2. The method as claimed in claim 1, wherein the obtaining a neural network model by performing transfer learning on the feature data set corresponding to each storage device using a Resnet18 network structure comprises:
taking the feature data set $a_i$ corresponding to each storage device as input sample data, and calculating an output vector $y_i$ according to the following formula:
Figure FDA0003051291200000022
calculating the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize a preset loss function $L(a_i, y_i)$;
substituting the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ into the above formula to obtain a neural network model represented by the following formula:
Figure FDA0003051291200000031
wherein $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the i-th storage device and the device feature values of all storage devices in the cloud storage system, $i = 1, 2, \ldots, n$; n is the total number of storage devices in the cloud storage system, m is the number of parameters in the feature data set corresponding to each storage device, and $n < m$; $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector; $W_1 \in R^{256 \times 256,\, m}$ denotes that $W_1$ is a $(256 \times 256) \times m$-dimensional matrix; $W_2 \in R^{n,\, 100}$ denotes that $W_2$ is an $n \times 100$-dimensional matrix; $c_1 \in R^{256 \times 256,\, 1}$ denotes that $c_1$ is a $(256 \times 256)$-dimensional column vector; $c_2 \in R^{n}$ denotes that $c_2$ is an n-dimensional column vector; x and y are the input and output, respectively, of the neural network model;
the expression of the function σ () is:
Figure FDA0003051291200000032
the expression of the Sigmoid () function is:
Figure FDA0003051291200000033
the expression of the preset loss function $L(a_i, y_i)$ is:
Figure FDA0003051291200000034
$b_i$ is an n-dimensional row vector with the i-th position element value equal to 1 and the other position element values equal to 0.
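The final stage of the model and the training target $b_i$ can be illustrated with a small sketch. The formula figures are not reproduced in this text, so two things here are assumptions rather than the patent's stated equations: that the output is formed as $y = \mathrm{Sigmoid}(W_2 f_2 + c_2)$, which is merely consistent with the stated dimensions ($W_2$ is $n \times 100$, $f_2$ is 100-dimensional, $c_2$ and y are n-dimensional), and that Sigmoid() is the standard logistic function. The function names are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def output_layer(w2: List[List[float]], c2: List[float], f2: List[float]) -> List[float]:
    """Assumed form y = Sigmoid(W_2 f_2 + c_2), applied element-wise."""
    if len(w2) != len(c2) or any(len(row) != len(f2) for row in w2):
        raise ValueError("W_2 must be n x len(f_2) and c_2 must have n entries")
    return [sigmoid(sum(w * f for w, f in zip(row, f2)) + b)
            for row, b in zip(w2, c2)]


def one_hot(i: int, n: int) -> List[int]:
    """Target b_i: n entries, 1 at (1-based) position i and 0 elsewhere."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    return [1 if pos == i else 0 for pos in range(1, n + 1)]


if __name__ == "__main__":
    # Toy sizes for readability: n = 2 devices and a 3-dimensional f_2
    # (the claim uses a 100-dimensional f_2).
    y = output_layer([[0.2, -0.1, 0.4], [0.0, 0.3, -0.5]], [0.1, -0.2], [1.0, 2.0, 3.0])
    print(y, one_hot(1, 2))
```

Training would then look for the parameter values that bring each $y_i$ close to $b_i$ under the preset loss function, whose exact expression is given only in the figure.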
3. The method according to claim 2, wherein the inputting a feature data set corresponding to data to be stored into the neural network model for calculation, and determining the storage device according to an output result of the neural network model comprises:
inputting a characteristic data set corresponding to data to be stored into the neural network model, and calculating to obtain an output vector y; wherein y is an n-dimensional column vector;
determining the row number N corresponding to the element with the maximum value in the output vector y;
and determining the Nth storage device in the cloud storage system as the current storage device.
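The selection rule of claim 3 amounts to an argmax over the output vector. A minimal sketch follows; the function name is hypothetical, and breaking ties in favour of the lowest row number is an assumption, since the claim does not say how ties are handled.

```python
from typing import Sequence


def select_device(y: Sequence[float]) -> int:
    """Return the 1-based row number N of the largest element of y.

    Ties go to the lowest row number (an assumption; the claim is silent on ties).
    """
    if not y:
        raise ValueError("output vector y must not be empty")
    best_index = max(range(len(y)), key=lambda idx: y[idx])
    return best_index + 1


if __name__ == "__main__":
    # With three devices, the second row holds the largest score.
    print(select_device([0.1, 0.7, 0.2]))
```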
4. The method for dynamically accessing a storage device in a cloud storage system according to claim 3, wherein before inputting the feature data set corresponding to the data to be stored into the neural network model, the method further comprises:
comparing the dimension s of the feature data set corresponding to the data to be stored with the dimension m of the input data required by the neural network model;
if s is equal to m, executing the step of inputting the characteristic data set corresponding to the data to be stored into the neural network model;
if $m - k \le s \le m + k$ and $s \ne m$, transforming the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to the formula $x = W_{ms} x'$, then taking the transformed feature data set x corresponding to the data to be stored as the input of the neural network model, and executing the step of calculating to obtain an output vector y; wherein x' is the feature data set corresponding to the data to be stored and is an s-dimensional column vector, x is an m-dimensional column vector, $W_{ms}$ is a preset $m \times s$-dimensional real matrix, and k is a preset integer;
when s is less than m-k, determining preset small data storage equipment as current storage equipment;
and when s is larger than m + k, determining the preset big data storage device as the current storage device.
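The dimension check of claim 4 can be sketched as a small routing function. This is an illustration under stated assumptions: the device labels and function name are hypothetical, the last branch is taken to cover s > m + k (the only case left once the other three are excluded), and $W_{ms}$ is passed in as a plain list of rows.

```python
from typing import List, Tuple, Union

SMALL_DEVICE = "small-data-device"  # hypothetical label for the preset small data device
BIG_DEVICE = "big-data-device"      # hypothetical label for the preset big data device


def prepare_input(
    x_prime: List[float], m: int, k: int, w_ms: List[List[float]]
) -> Tuple[str, Union[List[float], str]]:
    """Route a feature data set by its dimension s = len(x_prime).

    Returns ("model", x) when the model should be evaluated on x,
    or ("device", label) when a preset device is chosen directly.
    """
    s = len(x_prime)
    if s == m:
        return "model", list(x_prime)
    if m - k <= s <= m + k:
        if len(w_ms) != m or any(len(row) != s for row in w_ms):
            raise ValueError("W_ms must be an m x s matrix")
        # x = W_ms x': map the s-dimensional input to m dimensions.
        x = [sum(w * v for w, v in zip(row, x_prime)) for row in w_ms]
        return "model", x
    if s < m - k:
        return "device", SMALL_DEVICE
    return "device", BIG_DEVICE  # remaining case: s > m + k


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 3 x 2 matrix
    print(prepare_input([1.0, 2.0], m=3, k=1, w_ms=w))
```

Because $W_{ms}$ depends on s, a deployment would presumably hold one preset matrix per admissible value of s; the claim only states that the matrix is preset.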
5. An apparatus for dynamically accessing a storage device in a cloud storage system, comprising:
the first acquisition module is used for acquiring a characteristic data set corresponding to each storage device in the cloud storage system; the characteristic data set comprises data characteristic values of stored data of corresponding storage devices and device characteristic values of all storage devices in the cloud storage system;
the learning module is used for performing transfer learning on the feature data sets corresponding to the storage devices by using a Resnet18 network structure to obtain a neural network model;
the second acquisition module is used for acquiring the data characteristic value of the data to be stored and generating a characteristic data set corresponding to the data to be stored; the feature data set corresponding to the data to be stored comprises data feature values of the data to be stored and device feature values of all storage devices in the cloud storage system;
the calculation module is used for inputting the feature data set corresponding to the data to be stored into the neural network model for calculation, and determining the storage equipment according to the output result of the neural network model;
the storage module is used for dynamically accessing the data to be stored into the storage equipment for cloud storage;
wherein the apparatus further comprises: the storage rule adjusting module is used for determining whether to adjust the storage rule of the storage device; the storage rule adjusting module may include:
the calibration unit is used for calling the equipment log of each storage device in the cloud storage system based on a historical database of the cloud storage system, fitting the equipment log, determining whether the corresponding storage device works normally according to the fitting result, and calibrating the normally working storage device;
the clustering unit is used for calling the storage log of each calibrated storage device based on a historical database of a cloud storage system, performing clustering analysis on the storage log of each calibrated storage device to obtain a cluster set, and calculating a comprehensive storage value Z corresponding to the calibrated storage device based on the cluster set and the following formula;
Figure FDA0003051291200000051
wherein Q represents the total number of classes of sub-data related to the storage log in the cluster set; $K_{q1}$ represents the sub-data attribute value of the q1-th class of sub-data; K represents the storage attribute value of the corresponding calibrated storage device; $\delta_{q1}$ represents the proportion of the data volume of the q1-th class of sub-data in the cluster set;
the judging unit is used for judging whether an absolute value difference value between the comprehensive storage value Z and a preset storage value corresponding to the storage equipment after calibration processing is within a preset difference value range or not;
the storage unit is used for reserving the storage log corresponding to the storage device after the calibration processing when the judgment result of the judgment unit is yes, and meanwhile, continuously storing the related target data according to the current storage rule corresponding to the storage device after the calibration processing;
an obtaining unit, configured to obtain each type of sub data in the cluster set corresponding to the storage device after the calibration processing if the determination result of the determining unit is negative, and extract top N1 maximum data volume fraction ratios based on the data volume fraction ratios of each type of sub data in the cluster set, where N1 is smaller than Q, and N1 and Q are both positive integers;
a matching value calculating unit, configured to calculate, according to the following formula, matching values P1 between the sub-data attribute values corresponding to the N1 maximum data size ratio proportions and the storage attribute of the corresponding calibrated storage device, respectively;
Figure FDA0003051291200000052
wherein $n1 = 1, 2, 3, \ldots, N1$; $K_{n1}$ represents the sub-data attribute value of the n1-th class of sub-data; $\delta_{n1}$ represents the proportion of the data volume of the n1-th class of sub-data in the cluster set;
and the rule adjusting unit is used for acquiring the subdata with the matching value smaller than the preset value, storing the corresponding subdata into the standby equipment, adjusting the current storage rule of the corresponding storage equipment after calibration processing, and continuously storing the related target data according to the adjusted storage rule.
6. The apparatus for dynamically accessing a storage device in a cloud storage system according to claim 5, wherein the learning module comprises:
a first calculation unit, configured to take the feature data set $a_i$ corresponding to each storage device as input sample data and calculate an output vector $y_i$ according to the following formula:
Figure FDA0003051291200000061
a second calculation unit, configured to calculate the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ that minimize a preset loss function $L(a_i, y_i)$;
a model establishing unit, configured to substitute the values of the parameters $W_1$, $W_2$, $c_1$, $c_2$ calculated by the second calculation unit into the formula used by the first calculation unit, to obtain a neural network model represented by the following formula:
Figure FDA0003051291200000062
wherein $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})^T$ is the set consisting of the data feature values of the stored data of the i-th storage device and the device feature values of all storage devices in the cloud storage system, $i = 1, 2, \ldots, n$; n is the total number of storage devices in the cloud storage system, m is the number of parameters in the feature data set corresponding to each storage device, and $n < m$; $f_1$ and $f_2$ are intermediate outputs, and $f_2$ is a 100-dimensional column vector; $W_1 \in R^{256 \times 256,\, m}$ denotes that $W_1$ is a $(256 \times 256) \times m$-dimensional matrix; $W_2 \in R^{n,\, 100}$ denotes that $W_2$ is an $n \times 100$-dimensional matrix; $c_1 \in R^{256 \times 256,\, 1}$ denotes that $c_1$ is a $(256 \times 256)$-dimensional column vector; $c_2 \in R^{n}$ denotes that $c_2$ is an n-dimensional column vector; x and y are the input and output, respectively, of the neural network model;
the expression of the function σ () is:
Figure FDA0003051291200000063
the expression of the Sigmoid () function is:
Figure FDA0003051291200000071
the expression of the preset loss function $L(a_i, y_i)$ is:
Figure FDA0003051291200000072
$b_i$ is an n-dimensional row vector with the i-th position element value equal to 1 and the other position element values equal to 0.
7. The apparatus for dynamically accessing a storage device in a cloud storage system according to claim 6, wherein the computing module comprises:
the third calculation unit is used for inputting the feature data set corresponding to the data to be stored into the neural network model and calculating to obtain an output vector y; wherein y is an n-dimensional column vector;
the first determining unit is used for determining the row number N corresponding to the element with the maximum value in the output vector y;
and the second determining unit is used for determining the Nth storage device in the cloud storage system as the current storage device according to the N value determined by the first determining unit.
8. The apparatus for dynamically accessing a storage device in a cloud storage system according to claim 7, wherein the computing module further comprises:
the comparison unit is used for comparing the dimension s of the feature data set corresponding to the data to be stored with the dimension m of the input data required by the neural network model;
a transformation unit, configured to, when the comparison unit determines that $m - k \le s \le m + k$ and $s \ne m$, transform the feature data set x' corresponding to the data to be stored into a feature data set x having the input data dimension required by the neural network model according to the formula $x = W_{ms} x'$, and then provide the transformed feature data set x corresponding to the data to be stored to the third calculation unit as the input of the neural network model; wherein x' is the feature data set corresponding to the data to be stored and is an s-dimensional column vector, x is an m-dimensional column vector, $W_{ms}$ is a preset $m \times s$-dimensional real matrix, and k is a preset integer;
the third calculation unit is specifically configured to, when the comparison unit determines that $s = m$, input the feature data set corresponding to the data to be stored into the neural network model and calculate to obtain an output vector; or to input the transformed feature data set x corresponding to the data to be stored, as provided by the transformation unit, into the neural network model and calculate to obtain an output vector;
the second determining unit is specifically configured to determine a preset small data storage device as the current storage device when the comparison unit determines that $s < m - k$; or to determine a preset big data storage device as the current storage device when the comparison unit determines that $s > m + k$; or to determine the Nth storage device in the cloud storage system as the current storage device according to the N value determined by the first determining unit.
CN202011204997.4A 2020-11-02 2020-11-02 Method and device for dynamically accessing storage equipment in cloud storage system Active CN112506423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011204997.4A CN112506423B (en) 2020-11-02 2020-11-02 Method and device for dynamically accessing storage equipment in cloud storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011204997.4A CN112506423B (en) 2020-11-02 2020-11-02 Method and device for dynamically accessing storage equipment in cloud storage system

Publications (2)

Publication Number Publication Date
CN112506423A CN112506423A (en) 2021-03-16
CN112506423B true CN112506423B (en) 2021-07-20

Family

ID=74954964

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011204997.4A Active CN112506423B (en) 2020-11-02 2020-11-02 Method and device for dynamically accessing storage equipment in cloud storage system

Country Status (1)

Country Link
CN (1) CN112506423B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925793B (en) * 2021-03-29 2023-12-29 北京赛博云睿智能科技有限公司 Distributed hybrid storage method and system for multiple structural data

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103543959A (en) * 2013-10-08 2014-01-29 深圳市国泰安信息技术有限公司 Method and device for mass data caching
CN107832458A (en) * 2017-11-27 2018-03-23 中山大学 A kind of file classification method based on depth of nesting network of character level
CN107885464A (en) * 2017-11-28 2018-04-06 北京小米移动软件有限公司 Date storage method, device and computer-readable recording medium
CN109101994A (en) * 2018-07-05 2018-12-28 北京致远慧图科技有限公司 A kind of convolutional neural networks moving method, device, electronic equipment and storage medium
CN109582234A (en) * 2018-11-23 2019-04-05 金色熊猫有限公司 Storage resources distribution method, device, electronic equipment and computer-readable medium
CN110928484A (en) * 2018-09-19 2020-03-27 上海仪电(集团)有限公司中央研究院 Hybrid cloud storage method based on software defined storage
CN111538458A (en) * 2018-12-31 2020-08-14 爱思开海力士有限公司 Memory device performance optimization using deep learning
CN111722806A (en) * 2020-06-19 2020-09-29 华中科技大学 Cloud disk allocation method and device, electronic equipment and storage medium
US20200341943A1 (en) * 2019-04-25 2020-10-29 Western Digital Technologies, Inc. Intelligent Data Access Across Tiered Storage Systems

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104503721A (en) * 2014-12-22 2015-04-08 重庆文理学院 Mixed band mathematic model based on fitting approximation algorithm
US10360214B2 (en) * 2017-10-19 2019-07-23 Pure Storage, Inc. Ensuring reproducibility in an artificial intelligence infrastructure
US11126666B2 (en) * 2019-03-20 2021-09-21 Verizon Media Inc. Temporal clustering of non-stationary data
CN112866260A (en) * 2020-08-27 2021-05-28 黄天红 Flow detection method combining cloud computing and user behavior analysis and big data center

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于负载特征表示的云资源管理算法研究;刘春红;《中国博士学位论文全文数据库》;20180915(第09期);I139-13 *

Also Published As

Publication number Publication date
CN112506423A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right
Effective date of registration: 20211124
Granted publication date: 20210720